Preface


Sources in the Development of Mathematics: Series and Products from the Fifteenth to 
the Twenty-first Century, my book of 2011, was intended for an audience of graduate 
students or beyond. However, since much of its mathematics lies at the foundations of 
the undergraduate mathematics curriculum, I decided to use portions of my book as the 
text for an advanced undergraduate course. I was very pleased to find that my curious 
and diligent students, of varied levels of mathematical talent, could understand a good 
bit of the material and get insight into mathematics they had already studied as well 
as topics with which they were unfamiliar. Of course, the students could profitably 
study such topics from good textbooks. But I observed that when they read original 
proofs, perhaps with gaps or with slightly opaque arguments, students gained very 
valuable insight into the process of mathematical thinking and intuition. Moreover, the 
study of the steps, often over long periods of time, by which earlier mathematicians 
refined and clarified their arguments revealed to my students the essential points at the 
crux of those results, points that may be more difficult to discern in later streamlined 
presentations. As they worked to understand the material, my students witnessed the 
difficulty and beauty of original mathematical work, and this was a source of great 
enjoyment to many of them. I have now thrice taught this course, with extremely 
positive student response. 

In order for my students to follow the foundational mathematical arguments 
in Sources, I was often required to provide additional material, material actually 
contained in the original works of the mathematicians being studied. I therefore 
decided to expand my book, as a second edition in two volumes, to make it more 
accessible to readers, from novices to accomplished mathematicians. This second 
edition contains about 250 pages of new material, including more details within the 
original proofs, elaborations and further developments of results, and additional results 
that may give the reader a better perspective. Furthermore, to give the material greater 
focus, I have limited this second edition to the topics of series and products, areas that 
today permeate both applied and pure mathematics; the second edition is thus entitled 
Series and Products in the Development of Mathematics. 

The first volume of my work discusses the development of the fundamental though
powerful and essential methods in series and products that do not employ complex 
analytic methods or sophisticated machinery such as Fourier transforms. Much of 
this material would be accessible, perhaps with guidance, to advanced undergraduate 
students. The second volume deals with more recent work and requires considerable 
mathematical background. For example, in volume 2, I discuss Weil’s 1949 paper on 
solutions of equations in finite fields and de Branges’s conquest of the Bieberbach 
conjecture. Each volume contains the same complete bibliography. 

The exercises at the end of the chapters present many additional original results and 
may be studied simply for the supplementary theorems they contain. The exercises are 
accompanied by references to the original works, as an aid to further research. Readers 
may attempt to prove the results in the problems and, by use of the references, compare 
their own solutions with the originals. Moreover, many of the exercises can be tackled 
by methods similar to those given in the text, so that some exercises can be realistically 
assigned to a class as homework. I assigned many exercises to my classes, and found 
that the students enjoyed and benefited from their efforts to find solutions. Thus, the 
exercises may be useful as problems to be solved, and also for the results they present. 

Detailed study of original mathematical works provides a point of entry into the 
minds of the creators of powerful theories, and thus into the theories themselves. 
But tracing the discovery and evolution of mathematical ideas and theorems entails 
the examination of many, many papers, letters, notes, and monographs. For example, 
in this work I have discussed the work of more than three hundred mathematicians, 
including arguments and theorems contained in approximately one hundred works and 
letters of Euler alone. Locating, studying, and grasping the interconnections among 
such original works and results is a ponderous, complex, and rewarding effort. In this 
second edition, I have added numerous footnotes and almost five hundred works to the 
bibliography. My hope is that the detailed footnotes and the expanded bibliography, 
containing both original works and works of distinguished expositors and historians 
of mathematics, may encourage and facilitate the efforts of those who wish to search 
out and study the original sources of our inherited mathematical wealth. 

I first wish to thank my wife, who typeset and edited this work, made innumerable 
corrections and refinements to the text, and devotedly assisted me with translations and 
locating references. I am also very grateful to NFN Kalyan for his encouragement and 
for creating the eloquent artwork for the cover of these volumes. I greatly appreciate 
Maitreyi Lagunas’s unflagging support and interest. I thank Bruce Atwood who 
cheerfully constructed the nice diagrams contained in this work, and Paul Campbell 
who generously provided expert technical support and advice. I am grateful to 
my student Shambhavi Upadhyaya, who has an unusual ability to proofread very 
accurately, for spending so much time giving useful suggestions for improvement. 
I am indebted to my students whose questions and enthusiasm helped me refine this 
second edition. I also thank the very capable librarians at Beloit College, especially 
Chris Nelson and Cindy Cooley. Finally, I wish to acknowledge the inspiration 
provided me by my friend, the late Dick Askey. 

q-Series


25.1 Preliminary Remarks 


The theory of q-series in modern mathematics plays a significant role in partition
theory and modular functions as well as in some aspects of Lie algebras and statistical 
mechanics. This subject began quietly, however, with two combinatorial problems 
posed in a September 1740 letter from Phillipe Naudé (1684-1747) to Euler. Naudé 
was a mathematician of French origin working in Berlin. In general, his question was 
how to find the number of ways in which a given number could be expressed as the sum 
of a fixed number, first of distinct integers and then without the requirement that the 
integers in the sum be distinct. For example, in how many ways can 50 be expressed 
as a sum of 7 distinct/not necessarily distinct integers?¹

As an example of both these problems, 7 can be expressed as a sum of three distinct
integers in one way, 1 + 2 + 4; whereas it can be expressed as a sum of three integers
in four ways: 1 + 1 + 5, 1 + 2 + 4, 1 + 3 + 3, 2 + 2 + 3. Euler received Naudé’s
letter in St. Petersburg, just before he moved to Berlin. Within two weeks, in a reply to
Naudé, Euler outlined a solution and soon after that he presented his complete solution
to the Petersburg Academy.² In 1748, he devoted a whole chapter to this topic in his
Introductio in Analysin Infinitorum.³ The essential idea in Euler’s solution was that
the coefficient of $q^k x^m$ in the series expansion of the infinite product


$$f(q,x) = (1+qx)(1+q^2x)(1+q^3x)\cdots \tag{25.1}$$


gave the number of ways of writing k as a sum of m distinct positive integers. Euler 
used the functional relation 


$$f(q,x) = (1+qx)\,f(q,qx) \tag{25.2}$$


1 See Eu. 1-2 pp. 163-193, especially § 19, E 158 § 19 and Weil (1983) pp. 276-277.
2 Eu. 1-2 pp. 163-193, E 158.
3 Euler (1988) chapter 16, especially pp. 256-270.
to prove that

$$f(q,x) = \sum_{m=0}^{\infty} \frac{q^{m(m+1)/2}\,x^m}{(1-q)(1-q^2)\cdots(1-q^m)}. \tag{25.3}$$

He noted that

$$\frac{1}{(1-q)(1-q^2)\cdots(1-q^m)} = (1+q+q^{1+1}+\cdots)(1+q^2+q^{2+2}+\cdots)\cdots(1+q^m+q^{m+m}+\cdots) = \sum_{n=0}^{\infty} a_n q^n, \tag{25.4}$$


where the middle product showed that $a_n$, the coefficient of $q^n$, was the number of
ways of writing n as a sum of integers chosen from the set 1,2,...,m. This implied
that the coefficient of $q^k x^m$ on the right-hand side of (25.3) was the number of ways
of writing $k - m(m+1)/2$ as a sum of integers from the set 1,2,...,m. Thus, Euler stated
the theorem: The number of different ways in which the number n can be expressed
as a sum of m different numbers is the same as the number of different ways in which
$n - m(m+1)/2$ can be expressed as the sums of the numbers 1,2,3,...,m.
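
Euler's theorem can be checked by brute force for small values. The following is only an illustrative sketch, not Euler's generating-function argument: the function names `count_distinct`, `count_parts_up_to`, and `theorem_holds` are hypothetical, and the first function simply enumerates all m-element subsets, so it is practical only for small n.

```python
from itertools import combinations


def count_distinct(n, m):
    """Ways to write n as a sum of m distinct positive integers (brute force)."""
    return sum(1 for c in combinations(range(1, n + 1), m) if sum(c) == n)


def count_parts_up_to(n, m):
    """Ways to write n as a sum of integers chosen from 1..m, repeats allowed."""
    if n < 0:
        return 0
    ways = [1] + [0] * n          # ways[j]: number of such sums equal to j
    for part in range(1, m + 1):
        for j in range(part, n + 1):
            ways[j] += ways[j - part]
    return ways[n]


def theorem_holds(n, m):
    """Compare the two counts in Euler's theorem for one pair (n, m)."""
    return count_distinct(n, m) == count_parts_up_to(n - m * (m + 1) // 2, m)


print(count_distinct(7, 3))   # counts the single sum 1 + 2 + 4
print(all(theorem_holds(n, m) for n in range(1, 21) for m in range(1, 6)))
```

The second function mirrors the middle product in (25.4): each pass of the outer loop multiplies in one geometric series $1 + q^j + q^{2j} + \cdots$.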

For the second problem, Euler used the product 


$$g(q,x) = \prod_{n=1}^{\infty} (1-q^n x)^{-1} \tag{25.5}$$


and obtained the corresponding series and theorem in a similar way. Euler here used 
functional relations to evaluate the product as a series, just as he earlier employed 
functional relations to evaluate the beta integral as a product. Of course, this method 
goes back to Wallis. 

Euler also considered the case x = 1. In that case (in modern notation), we have 


$$\prod_{n=1}^{\infty} (1-q^n)^{-1} = (1+q+q^2+\cdots)(1+q^2+q^4+\cdots)(1+q^3+q^6+\cdots)\cdots = \sum_{n=0}^{\infty} p(n)\,q^n, \tag{25.6}$$


where p(n) is the number of partitions of n, or the number of ways in which n can be
written as a sum of positive integers. For example, p(4) = 5 because 4 has the five
partitions

$$1+1+1+1, \quad 2+1+1, \quad 2+2, \quad 3+1, \quad 4.$$
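
The values p(n) can be generated by multiplying out the product in (25.6) one factor at a time. This is a minimal sketch under that reading; the name `partition_numbers` is an illustrative choice, not notation from the text.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] by multiplying in the factors 1/(1 - q^k)."""
    p = [1] + [0] * limit
    for k in range(1, limit + 1):
        # Multiplying by 1/(1 - q^k) allows any number of parts equal to k.
        for n in range(k, limit + 1):
            p[n] += p[n - k]
    return p


print(partition_numbers(10))
```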


The product in (25.6) also led Euler to consider its reciprocal, $\prod_{n=1}^{\infty}(1-q^n)$. He
attempted to expand this as a series but it took him nine years to completely resolve
this difficult problem. In his first attempt, presented in 1741 but published ten years
later,⁴ he multiplied a large number of terms of the product to find that


$$\prod_{n=1}^{\infty}(1-q^n) = 1 - q - q^2 + q^5 + q^7 - q^{12} - q^{15} + q^{22} + q^{26} - q^{35} - q^{40} + q^{51} + \cdots. \tag{25.7}$$
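
Euler's hand multiplication is easy to repeat by machine. The sketch below (the name `euler_product` is hypothetical) multiplies the factors $1 - q^n$ into a truncated coefficient list and then reads off the surviving exponents and their signs.

```python
def euler_product(limit):
    """Coefficients of prod_{n>=1} (1 - q^n), truncated after q^limit."""
    c = [1] + [0] * limit
    for n in range(1, limit + 1):
        # Multiply by (1 - q^n); descending j keeps c[j - n] at its old value.
        for j in range(limit, n - 1, -1):
            c[j] -= c[j - n]
    return c


coeffs = euler_product(51)
nonzero = [(exponent, a) for exponent, a in enumerate(coeffs) if a != 0]
exponents = [e for e, _ in nonzero]
differences = [b - a for a, b in zip(exponents, exponents[1:])]
print(nonzero)
print(differences)
```

Every coefficient that survives is +1 or -1, and the list of differences is the one Euler is thought to have examined.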

He quickly found a general expression for the exponents, $m(3m \pm 1)/2$. He most
probably did this by considering the differences in the sequence of exponents; note 
that the sequence of exponents is 


0, 1, 2,5, 7, 12, 15, 22, 26, 35, 40, 51,.... 
Observe that the sequence of differences is then 
1, 1, 3, 2, 5, 3, 7, 4, 9, 5, 11, 6, ....


The pattern of this sequence suggests that one should group the sequence of 
exponents into two separate sequences, first taking the exponents of the odd-numbered 
terms and then the exponents of the even-numbered terms.⁵ For example, the
sequence of exponents of the odd-numbered terms is 0, 2, 7, 15, 26, 40, ..., and their 
differences are 2, 5, 8, 11, 14,.... Since the differences of these differences are 3 in 
every case, we may apply the formula of Zhu Shijie and Montmort, given in Section 
10.3, to perceive that the (n + 1)th term of the sequence of odd-numbered exponents 
will be given by 


$$0 + 2n + 3\cdot\frac{n(n-1)}{2} = \frac{n(3n+1)}{2}.$$

Similarly, the nth term in the sequence of even-term exponents is $n(3n-1)/2$. In the
Introductio, Euler wrote, “If we consider this sequence with some attention we will
note that the only exponents which appear are of the form $(3n^2 \pm n)/2$ and that the sign of
the corresponding term is negative when n is odd, and the sign is positive when n is
even.”⁶ Thus, Euler made the conjecture


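$$\prod_{n=1}^{\infty}(1-q^n) = \sum_{m=-\infty}^{\infty} (-1)^m q^{m(3m-1)/2} = 1 + \sum_{m=1}^{\infty} (-1)^m \left(q^{m(3m-1)/2} + q^{m(3m+1)/2}\right), \tag{25.8}$$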


and finally found a proof of this in 1750. He immediately wrote Goldbach about the
details of the proof,⁷ explaining that it depended on the algebraic identity:


$$(1-\alpha)(1-\beta)(1-\gamma)(1-\delta)\,\text{etc.} = 1 - \alpha - \beta(1-\alpha) - \gamma(1-\alpha)(1-\beta) - \delta(1-\alpha)(1-\beta)(1-\gamma) - \text{etc.}$$


4 Eu. 1-2 pp. 163-193, especially p. 193. E 158 § 37.

5 Eu. 1-2 pp. 241-253. E 175 § 8.

6 Euler (1988) p. 274.

7 Fuss (1968) vol. 1, pp. 522-524. See also Eu. 1-2 pp. 390-398. E 244.
This identity is easy to check, since the first three terms on the right-hand side add
up to

$$1 - \alpha - \beta(1-\alpha) = (1-\alpha)(1-\beta),$$

and when this is added to the fourth term, we get

$$(1-\alpha)(1-\beta) - \gamma(1-\alpha)(1-\beta) = (1-\alpha)(1-\beta)(1-\gamma),$$


and so on. An interesting feature of the series (25.8) is that the exponent of q is a
quadratic in m, the index of summation. Surprisingly, series of this kind had already
appeared in 1690 within Jakob Bernoulli’s works on probability theory,⁸ but he was
unable to do much with them. Over a century later, Gauss initiated a systematic study
of these series. Entry 58 of Gauss’s mathematical diary, dated February 1797, gives a
continued fraction expansion of one of Bernoulli’s series:⁹

$$1 - a + a^3 - a^6 + a^{10} - \cdots = \cfrac{1}{1+\cfrac{a}{1+\cfrac{a^2-a}{1+\cfrac{a^3}{1+\cfrac{a^4-a^2}{1+\cfrac{a^5}{1+\text{etc.}}}}}}} \tag{25.9}$$


In his diary, Gauss added the comment, “From this all series where the exponents 
form a series of the second order are easily transformed.” About a year later, he raised 
the problem of expressing $1 + q + q^3 + q^6 + q^{10} + \cdots$ as an infinite product.
Gauss came upon series of this type around 1794 in the context of his work on the
arithmetic-geometric mean, which he had been studying since 1791.¹⁰ This latter work
was absorbed into his theory of elliptic functions. Series (25.8) and (25.9) are actually 
examples of the special kind of q-series called theta functions. Theta functions also 
arose naturally in Fourier’s 1807 study of heat conduction. 

Unfortunately, Gauss did not publish any of his work on theta or elliptic functions, 
and it remained for Abel and Jacobi to independently rediscover much of this work, 
going beyond Gauss in many respects. Around 1805-1808, Gauss began to view 
q-series in a different way. For example, his 1808 paper¹¹ on q-series dealt with a
generalization of the binomial coefficient and the binomial series. In particular, he 
defined the Gaussian polynomial 


8 Bernoulli and Sylla (2006) pp. 176-180. 
9 See Dunnington (2004) p. 474. 
10 Peters (1860-1865) vol. 1, p. 125. See also Gauss (1863-1927) vol. 3, pp. 361-371, also vol. 10, part 2, 
p. 18 of Schlesinger’s article on Gauss’s work in function theory. 
11 Gauss (1981) pp. 463-495. 
$$(m,\mu) = \frac{(1-q^m)(1-q^{m-1})(1-q^{m-2})\cdots(1-q^{m-\mu+1})}{(1-q)(1-q^2)(1-q^3)\cdots(1-q^{\mu})}. \tag{25.10}$$

Note that Gauss wrote x instead of q. Observe that as $q \to 1$,

$$(m,\mu) \to \binom{m}{\mu}. \tag{25.11}$$
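
That the quotient defining the Gaussian polynomial really is a polynomial, and that it reduces to the binomial coefficient at q = 1, can be seen in a small computation. The sketch below is an illustration only: the name `gaussian_polynomial` is hypothetical, the arithmetic is done on truncated coefficient lists, and it assumes $0 \le \mu \le m$.

```python
from math import comb


def gaussian_polynomial(m, mu):
    """Coefficient list (in powers of q) of the Gaussian polynomial (m, mu)."""
    degree = mu * (m - mu)                 # known degree of (m, mu) in q
    size = degree + 1
    c = [1] + [0] * degree
    for j in range(mu):                    # numerator factors (1 - q^(m-j))
        e = m - j
        for t in range(size - 1, e - 1, -1):
            c[t] -= c[t - e]
    for i in range(1, mu + 1):             # divide by (1 - q^i) as a power series
        for t in range(i, size):
            c[t] += c[t - i]
    return c


print(gaussian_polynomial(4, 2))           # coefficients of 1 + q + 2q^2 + q^3 + q^4
print(sum(gaussian_polynomial(4, 2)), comb(4, 2))
```

Summing the coefficients evaluates the polynomial at q = 1, which is the limit described in (25.11). Working modulo $q^{\mu(m-\mu)+1}$ is safe here because the final quotient has exactly that degree.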


This work led to an unexpected byproduct: an evaluation of the Gauss sum
$\sum_{k=0}^{n-1} e^{2\pi i k^2/n}$, where n was an odd positive integer. This sum had already appeared
naturally in Gauss’s theory of the cyclotomic equation $x^n - 1 = 0$, to which he had
devoted the final chapter of his 1801 Disquisitiones Arithmeticae.¹² There Gauss had
computed the square of the Gauss sum, but he was unable to determine the correct sign
of the square root. Already in 1801, he knew that it was important to find the exact
value of the sum; he expended considerable effort over the next four years to compute
the Gauss sum, and it was a complete surprise for him when the result dropped out
of his work on q-series. In September 1805, he wrote his astronomer friend, Wilhelm
Olbers,¹³


What I wrote there [Disqu. Arith. section 365] ..., I proved rigorously, but I was always annoyed 
by what was missing, namely, the determination of the sign of the root. This gap spoiled whatever 
else I found, and hardly a week may have gone by in the last four years without one or more 
unsuccessful attempts to unravel this knot - just recently it again occupied me much. But all the 
brooding, the searching, was to no avail, and I had sadly to lay down my pen again. A few days 
ago, I finally succeeded - not by my efforts, but by the grace of God, I should say. The mystery 
was solved the way lightning strikes, I myself could not find the connection between what I knew 
previously, what I investigated last, and the way it was finally solved. 


He recorded these events in his diary:¹⁴


(May 1801) A method for proving the first fundamental theorem has been found by means of a
most elegant theorem in the division of the circle, thus

$$\sum \sin\frac{nn}{a}P = +\sqrt{a},\ 0,\ 0,\ +\sqrt{a}; \qquad \sum \cos\frac{nn}{a}P = +\sqrt{a},\ +\sqrt{a},\ 0,\ 0 \tag{25.12}$$

according as a ≡ 0, 1, 2, 3 (mod 4) substituting for n all numbers from 0 to (a − 1). (August
1805) The proof of the most charming theorem recorded above, May 1801, which we have sought
to prove for 4 years and more with every effort, at last perfected.
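
The theorem Gauss records can be tested numerically. The sketch below assumes the modern formulation, in which the sine and cosine sums are the imaginary and real parts of $\sum_{n=0}^{a-1} e^{2\pi i n^2/a}$; the names `gauss_sum` and `expected_value` are hypothetical.

```python
import cmath
import math


def gauss_sum(a):
    """Sum of exp(2*pi*i*n*n/a) over n = 0, ..., a-1."""
    # Reducing n*n modulo a keeps the argument small and the result accurate.
    return sum(cmath.exp(2j * math.pi * (n * n % a) / a) for n in range(a))


def expected_value(a):
    """Value according as a = 0, 1, 2, 3 (mod 4): cosine sum + i * sine sum."""
    r = math.sqrt(a)
    return [complex(r, r), complex(r, 0), complex(0, 0), complex(0, r)][a % 4]


for a in (3, 4, 5, 6, 7, 8):
    print(a, gauss_sum(a), expected_value(a))
```

The sign of the square root, which cost Gauss four years, shows up here simply as the fact that the computed sums are $+\sqrt{a}$ and not $-\sqrt{a}$.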




Conceptually, this was a major achievement, since it served to connect cyclotomy with
the reciprocity law. Gauss may have initially considered the polynomial $\sum_{k=0}^{m}(m,k)x^k$
as a possible analog of the finite binomial series. In any case, he expressed the sum
as a finite product when x = −1 and when $x = \sqrt{q}$, and these formulas finally
yielded the correct value of the Gauss sum. It is interesting to note that the polynomial


12 For an interesting commentary on Gauss’s work in cyclotomy, see Neumann (2007a) and (2007b).
13 See Bühler (1981) p. 31.
14 Dunnington (2004) p. 481.
$\sum_{k=0}^{m}(m,k)x^k$ played a key role in Szegő’s theory of orthogonal polynomials on the
unit disc.

Gauss found the appropriate q-extension of the terminating binomial theorem,
perhaps around 1808, but he did not publish it. In 1811, Heinrich A. Rothe (1773-1841)
first published this result in the preface of his Systematisches Lehrbuch der
Arithmetik¹⁵ as the formula


$$\sum_{k=0}^{m} \frac{(1-q^m)(1-q^{m-1})\cdots(1-q^{m-k+1})}{(1-q)(1-q^2)\cdots(1-q^k)}\, q^{k(k-1)/2}\, x^{m-k} y^k = (x+y)(x+qy)\cdots(x+q^{m-1}y). \tag{25.13}$$
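
Rothe's formula can be confirmed in exact rational arithmetic for particular values of q, x, and y. This is only a spot check under the assumption that the exponent of q in the kth term is k(k−1)/2; the function names are hypothetical and the sample values are arbitrary.

```python
from fractions import Fraction


def q_binomial(m, k, q):
    """Value at q of the quotient (1-q^m)...(1-q^(m-k+1)) / ((1-q)...(1-q^k))."""
    num = den = Fraction(1)
    for j in range(k):
        num *= 1 - q ** (m - j)
        den *= 1 - q ** (j + 1)
    return num / den


def rothe_sum(m, q, x, y):
    """Left-hand side: the finite q-binomial sum."""
    return sum(
        q_binomial(m, k, q) * q ** (k * (k - 1) // 2) * x ** (m - k) * y ** k
        for k in range(m + 1)
    )


def rothe_product(m, q, x, y):
    """Right-hand side: (x + y)(x + q y)...(x + q^(m-1) y)."""
    product = Fraction(1)
    for j in range(m):
        product *= x + q ** j * y
    return product


q, x, y = Fraction(2, 3), Fraction(3), Fraction(-5, 7)
print(all(rothe_sum(m, q, x, y) == rothe_product(m, q, x, y) for m in range(9)))
```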


Although this was the most important result in the book, Rothe excluded it from 
the body of text, apparently in order to keep the book within the size required by the 
publisher. Gauss’s paper and Rothe’s formula indicated a direction for further research 
on q-series relating to the extension of the binomial theorem. This path was not 
pursued until the 1840s, except in Schweins’s Analysis of 1820.¹⁶ This work presented
a q-extension of Vandermonde’s identity (25.64).

In the 1820s, Jacobi investigated q-series in connection with his work on theta 
functions, a byproduct of his researches on elliptic functions. His most remarkable 
discovery in this area was the triple product identity. Jacobi’s famous Fundamenta 
Nova of 1829 stated the formula as¹⁷


$$(1+qz)(1+q^3z)(1+q^5z)\cdots\left(1+\frac{q}{z}\right)\left(1+\frac{q^3}{z}\right)\left(1+\frac{q^5}{z}\right)\cdots = \frac{1 + q\left(z+\frac{1}{z}\right) + q^4\left(z^2+\frac{1}{z^2}\right) + q^9\left(z^3+\frac{1}{z^3}\right) + \cdots}{(1-q^2)(1-q^4)(1-q^6)(1-q^8)\cdots}. \tag{25.14}$$


Jacobi regarded this identity as his most important formula in pure mathematics. He
gave several very important applications. In one of these, he derived an identity
giving the number of representations of an integer as a sum of four squares. In
another, he obtained an important series expression for the square root of the period
of some elliptic functions, allowing him to find a new derivation of the following
transformation of a theta function, originally due to Cauchy and Poisson:


$$1 + 2\sum_{n=1}^{\infty} e^{-n^2\pi x} = \frac{1}{\sqrt{x}}\left(1 + 2\sum_{n=1}^{\infty} e^{-n^2\pi/x}\right). \tag{25.15}$$


Jacobi also published a long paper on those series whose powers are quadratic
forms; the triple product identity formed the basis for this. In the 1820s, when Gauss


15 Rothe (1811).
16 Schweins (1820) pp. 292-293. 
17 Jacobi (1969) vol. 1, p. 234. 
learned of Jacobi’s work, he informed Jacobi that he had already found (25.14) in
1808. Legendre, on very friendly terms with Jacobi, refused to believe that Gauss
had anticipated his friend. In a letter to Jacobi,¹⁸ Legendre wrote, “Such outrageous
impudence is incredible in a man with enough ability of his own that he should not
have to take credit for other people’s discoveries.”¹⁹ Then again, Legendre had had his
own priority disputes with Gauss with regard to quadratic reciprocity and the method 
of least squares. 

In the early 1840s, papers on q-series appeared in quick succession by Cauchy in
France and Eisenstein, Jacobi, and E. Heine in Germany. As a second-year student at
Berlin in 1844, Eisenstein presented twenty-five papers for publication to Crelle’s
Journal. One of them was “Neuer Beweis und Verallgemeinerung des binomischen
Lehrsatzes.”²⁰ It began with the statement and proof of the Rothe-Gauss theorem; it
then applied Euler’s approach to the proof of the binomial theorem to obtain a version
of the q-binomial theorem. Some details omitted by Euler in his account were treated
in Eisenstein’s paper.

Jacobi and Cauchy stated and proved the q-binomial theorem in the form


$$1 + \frac{v-w}{1-q}\,z + \frac{(v-w)(v-qw)}{(1-q)(1-q^2)}\,z^2 + \frac{(v-w)(v-qw)(v-q^2w)}{(1-q)(1-q^2)(1-q^3)}\,z^3 + \cdots = \frac{(1-wz)(1-qwz)(1-q^2wz)(1-q^3wz)\cdots}{(1-vz)(1-qvz)(1-q^2vz)(1-q^3vz)\cdots}. \tag{25.16}$$


The idea in this proof was the same as the one used by Euler to prove (25.3), clearly
a particular case. Jacobi also went on to give a q-extension of Gauss’s $_2F_1$ summation
formula. At that time, it was natural for someone to consider the q-extension of a
general $_2F_1$ hypergeometric series; E. Heine did just that, and we discuss his work in
Chapter 27.


25.2 Jakob Bernoulli’s Theta Series 


It is interesting that the series with quadratic exponents, normally arising in the theory 
of elliptic functions, occurred in Bernoulli’s work in probability. In 1685, he proposed 
the following two problems in the Journal des Savans: 


Let there be two players A and B, playing against each other with two dice on the condition that 
whoever first throws a 7 will win. There are sought their expectations if they play in one of these 
orders: 


(1) A once, B once, A twice, B twice, A three times, B three times, A four times, B four times, 
etc. 
(2) A once, B twice, A three times, B four times, A five times, etc. 


18 Jacobi (1969) vol. 1, pp. 396-399, especially p. 398. 
19 For the English translation, see Remmert (1998) p. 29.
20 Eisenstein (1975) vol. 1, pp. 117-121. 
In his Ars Conjectandi, Bernoulli gave a solution, saying that in May 1690, when
no solution to this problem had yet appeared, he communicated a solution to Acta
Eruditorum.²¹ In the first case, Bernoulli gave the probability for A to win as


$$1 - m + m^2 - m^4 + m^6 - m^9 + m^{12} - m^{16} + m^{20} - m^{25} + \text{etc.} \tag{25.17}$$


and in the second case as


$$1 - m + m^3 - m^6 + m^{10} - m^{15} + m^{21} - m^{28} + m^{36} - m^{45} + \text{etc.} \tag{25.18}$$


In both cases, m = 5/6. To make the quadratic exponents explicit, write the two series as


$$1 + \sum_{n=1}^{\infty} m^{n^2+n} - \sum_{n=1}^{\infty} m^{n^2} \quad\text{and}\quad \sum_{n=0}^{\infty} (-1)^n m^{n(n+1)/2}.$$


Bernoulli remarked that the summation of these series was difficult because of the 
unequal jumps in the powers of m. He noted that numerical approximation to any
degree of accuracy was easy and that for m = 5/6, the value of the second series was
0.52393; we remark that this value is inaccurate by only one in the last decimal 
place. Jakob Bernoulli was very interested in polygonal and figurate numbers; in fact, 
he worked out the sum of the reciprocals of triangular numbers. Here he had series 
with triangular and square numbers as exponents. Gauss discovered a way to express 
these series as products. Euler found the product expansion of a series with pentagonal 
numbers as exponents. 
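
Bernoulli's two series, and his numerical value for the second, can be checked against a direct calculation that follows the order of play. The sketch below is illustrative: it takes m = 5/6 as the chance that a single throw of two dice fails to give a 7, and all function names are hypothetical.

```python
def series_order_1(m, terms=60):
    """1 + sum m^(n^2+n) - sum m^(n^2): A's chance under order (1)."""
    return 1 + sum(m ** (n * n + n) - m ** (n * n) for n in range(1, terms))


def series_order_2(m, terms=60):
    """sum (-1)^n m^(n(n+1)/2): A's chance under order (2)."""
    return sum((-1) ** n * m ** (n * (n + 1) // 2) for n in range(terms))


def direct_probability(order, m, rounds=60):
    """Add up A's winning chances block by block, following the order of play."""
    alive, win = 1.0, 0.0          # alive: chance that nobody has thrown a 7 yet
    for i in range(1, rounds + 1):
        a_throws, b_throws = (i, i) if order == 1 else (2 * i - 1, 2 * i)
        win += alive * (1 - m ** a_throws)
        alive *= m ** (a_throws + b_throws)
    return win


m = 5 / 6
print(series_order_1(m), direct_probability(1, m))
print(series_order_2(m), direct_probability(2, m))
```

The unequal jumps in the exponents arise because each block of throws is one longer than the last, so the number of failed throws before A's next block grows quadratically.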


25.3 Euler’s q-Series Identities


In response to the problems of Naudé, Euler proved the two identities: 


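$$(1+qx)(1+q^2x)(1+q^3x)(1+q^4x)\cdots = 1 + \frac{q}{1-q}\,x + \frac{q^3}{(1-q)(1-q^2)}\,x^2 + \cdots + \frac{q^{m(m+1)/2}}{(1-q)\cdots(1-q^m)}\,x^m + \cdots, \tag{25.19}$$

$$\frac{1}{(1-qx)(1-q^2x)\cdots(1-q^nx)\cdots} = 1 + \frac{q}{1-q}\,x + \frac{q^2}{(1-q)(1-q^2)}\,x^2 + \cdots + \frac{q^m}{(1-q)\cdots(1-q^m)}\,x^m + \cdots. \tag{25.20}$$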


Euler’s argument for the first identity was outlined in the opening remarks of this 
chapter. His proof of his second identity ran along similar lines. We here follow Euler’s 


21 Bernoulli and Sylla (2006) pp. 176-180. 
presentation from his Introductio,²² noting that Euler wrote x for our q and z for our x.
Note also that the term q-series came into use only in the latter half of the nineteenth
century, appearing in the works of Cayley, Rogers, and others. Jacobi may possibly
have been the first to use the symbol q in the context of elliptic functions, though he
did not use the term q-series. Euler let Z denote the infinite product on the left of 
(25.20) and he assumed that Z could be expanded as a series: 


$$Z = 1 + Px + Qx^2 + Rx^3 + Sx^4 + \cdots. \tag{25.21}$$

When x was replaced by qx in Z, he got

$$\frac{1}{(1-q^2x)(1-q^3x)(1-q^4x)\cdots} = (1-qx)Z.$$

Making the same substitution in (25.21), he obtained

$$(1-qx)Z = 1 + Pqx + Qq^2x^2 + Rq^3x^3 + Sq^4x^4 + \cdots. \tag{25.22}$$


When the series for Z was substituted in (25.22) and the coefficients of the various 
powers of x were equated, the result was 


$$P = \frac{q}{1-q}, \quad Q = \frac{Pq}{1-q^2}, \quad R = \frac{Qq}{1-q^3}, \quad S = \frac{Rq}{1-q^4},$$


and this proved (25.20). 
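
The recurrence for P, Q, R, S, ... translates directly into a numerical check of (25.20). The sketch below is illustrative only: it uses floating-point arithmetic, truncates both the series and the product, and the names and sample values q = 0.3, x = 0.5 are arbitrary assumptions.

```python
def euler_coefficients(q, count):
    """1, P, Q, R, ... from P = q/(1-q), Q = Pq/(1-q^2), R = Qq/(1-q^3), ..."""
    coeffs = [1.0]
    for m in range(1, count):
        coeffs.append(coeffs[-1] * q / (1 - q ** m))
    return coeffs


def series_value(q, x, count=80):
    """Partial sum of Z = 1 + Px + Qx^2 + ..."""
    return sum(c * x ** m for m, c in enumerate(euler_coefficients(q, count)))


def product_value(q, x, factors=200):
    """Partial product of 1 / ((1 - qx)(1 - q^2 x)(1 - q^3 x)...)."""
    value = 1.0
    for n in range(1, factors + 1):
        value /= 1 - q ** n * x
    return value


print(series_value(0.3, 0.5), product_value(0.3, 0.5))
```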


25.4 Euler’s Pentagonal Number Theorem 


Pentagonal numbers can be generated by the exponents $m(3m \pm 1)/2$ in Euler’s formula
(25.8):

$$\prod_{n=1}^{\infty}(1-x^n) = 1 + \sum_{m=1}^{\infty} (-1)^m \left(x^{m(3m-1)/2} + x^{m(3m+1)/2}\right). \tag{25.23}$$
n=1 m=1 


This identity is often referred to as the pentagonal number theorem. 

Recall that Euler had conjectured this result in 1741; he was convinced that this 
formula was valid, but he could not prove it. He was so confident of his conjecture 
that in 1747, he used this formula to prove a remarkable theorem on the sum of the 
divisors of an integer.²³ Concerning this theorem, he remarked in section 25.9 of his
paper, “Indeed, I have no other proof.” 

To understand the 1747 theorem in which he used his conjecture, let n be a
nonzero integer and let $\sigma(n) = \sum_{d \mid n} d$. Observe that if n were a negative integer,


22 Euler (1988) pp. 361-363. 
23 See Euler’s letter to Goldbach: Fuss (1968) vol. I, pp. 407-408. Also see his paper E 175. 
then σ(n) = 0. We remark that Euler’s own notation for σ(n) was $\int n$. Thus, for n a
positive integer, Euler’s formula was


$$\sigma(n) = \sigma(n-1) + \sigma(n-2) - \sigma(n-5) - \sigma(n-7) + \cdots, \tag{25.24}$$


where, if $n = m(3m \pm 1)/2$, the term σ(0) that then appears is taken to be n. Here the
numbers 1, 2, 5, 7, ... in (25.24) are the pentagonal numbers. In his proof, here given
in modernized and briefer notation, Euler first took the logarithmic derivative of


$$\sum_{n=-\infty}^{\infty} (-1)^n x^{n(3n+1)/2} = \prod_{m=1}^{\infty} (1-x^m) \tag{25.25}$$


to obtain 


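$$\sum_{n=1}^{\infty} (-1)^n \left(\frac{n(3n-1)}{2}\,x^{n(3n-1)/2} + \frac{n(3n+1)}{2}\,x^{n(3n+1)/2}\right) = -\left(\sum_{n=-\infty}^{\infty}(-1)^n x^{n(3n+1)/2}\right)\sum_{m=1}^{\infty}\frac{m x^m}{1-x^m}. \tag{25.26}$$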


He noted that the last sum on the right-hand side of (25.26) could be written as 


$$\sum_{m=1}^{\infty} m x^m \left(1 + x^m + x^{2m} + \cdots\right) = \sum_{m=1}^{\infty}\sum_{k=1}^{\infty} m\,x^{mk}. \tag{25.27}$$


He next observed that the coefficient of $x^n$ would contain an m for each mk = n.
This meant that when the order of summation in (25.27) was changed, the coefficient
of $x^n$ would be given by $\sum_{m \mid n} m = \sigma(n)$. Therefore

$$\sum_{m=1}^{\infty} \frac{m x^m}{1-x^m} = \sum_{n=1}^{\infty} \sigma(n)\,x^n,$$


and (25.26) could be rewritten as 


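$$\left(\sum_{n=-\infty}^{\infty}(-1)^n x^{n(3n+1)/2}\right)\sum_{n=1}^{\infty}\sigma(n)\,x^n + \sum_{m=1}^{\infty}(-1)^m\left(\frac{m(3m-1)}{2}\,x^{m(3m-1)/2} + \frac{m(3m+1)}{2}\,x^{m(3m+1)/2}\right) = 0. \tag{25.28}$$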


Euler then multiplied the two sums in (25.28) and equated coefficients to obtain 


$$\sigma(n) - \sigma(n-1) - \sigma(n-2) + \sigma(n-5) + \sigma(n-7) - \cdots = 0, \tag{25.29}$$


where the last nonzero term on the left-hand side of (25.29) would be ±n if n
were a pentagonal number. This proved the formula, on the assumption that (25.25)
was true.
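
Euler's recurrence gives a way to compute divisor sums without finding any divisors. The sketch below is an illustration under the convention stated for (25.24), namely that a term σ(0) is read as n; the function names are hypothetical.

```python
def sigma_direct(n):
    """Sum of the positive divisors of n, found by trial division."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def sigma_by_recurrence(limit):
    """sigma(0..limit) from Euler's recurrence; index 0 is a placeholder."""
    sigma = [0] * (limit + 1)
    for n in range(1, limit + 1):
        total, m = 0, 1
        while m * (3 * m - 1) // 2 <= n:
            sign = 1 if m % 2 == 1 else -1     # + for 1, 2; - for 5, 7; ...
            for g in (m * (3 * m - 1) // 2, m * (3 * m + 1) // 2):
                if g < n:
                    total += sign * sigma[n - g]
                elif g == n:
                    total += sign * n          # the convention sigma(0) = n
            m += 1
        sigma[n] = total
    return sigma


values = sigma_by_recurrence(100)
print(values[1:13])
print(all(values[n] == sigma_direct(n) for n in range(1, 101)))
```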
Euler’s proof of the pentagonal number theorem is elementary and employs simple
algebra in an ingenious way; we present it almost exactly as it appeared in Euler’s June
1750 letter to Goldbach.²⁴ He began with the algebraic identity mentioned earlier


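$$(1-\alpha)(1-\beta)(1-\gamma)(1-\delta)\,\text{etc.} = 1 - \alpha - \beta(1-\alpha) - \gamma(1-\alpha)(1-\beta) - \delta(1-\alpha)(1-\beta)(1-\gamma) - \text{etc.}$$

From this he had

$$(1-x)(1-x^2)(1-x^3)(1-x^4)(1-x^5)\,\text{etc.} = S = 1 - x - x^2(1-x) - x^3(1-x)(1-x^2) - x^4(1-x)(1-x^2)(1-x^3) - \text{etc.}$$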


He set S = 1 — x — Axx, where 


$$A = 1 - x + x(1-x)(1-x^2) + x^2(1-x)(1-x^2)(1-x^3) + \text{etc.}$$


Multiplying out by the factor 1 — x in each term, he obtained 


$$\begin{aligned} A &= 1 - x - x^2(1-x^2) - x^3(1-x^2)(1-x^3) - \text{etc.} \\ &\qquad + x(1-x^2) + x^2(1-x^2)(1-x^3) + x^3(1-x^2)(1-x^3)(1-x^4) + \text{etc.} \\ &= 1 - x^3 - x^5(1-x^2) - x^7(1-x^2)(1-x^3) - \text{etc.} \end{aligned}$$


He set $A = 1 - x^3 - Bx^5$, where


$$B = 1 - x^2 + x^2(1-x^2)(1-x^3) + x^4(1-x^2)(1-x^3)(1-x^4) + \text{etc.}$$


After multiplying out by the factor $1 - x^2$, appearing in each term of B, he arrived at

$$\begin{aligned} B &= 1 - x^2 - x^4(1-x^3) - x^6(1-x^3)(1-x^4) - \text{etc.} \\ &\qquad + x^2(1-x^3) + x^4(1-x^3)(1-x^4) + x^6(1-x^3)(1-x^4)(1-x^5) + \text{etc.} \\ &= 1 - x^5 - x^8(1-x^3) - x^{11}(1-x^3)(1-x^4) - \text{etc.} \end{aligned}$$

Euler then set $B = 1 - x^5 - x^8C$, where $C = 1 - x^3 + x^3(1-x^3)(1-x^4) + x^6(1-x^3)(1-x^4)(1-x^5) + \text{etc.}$ Multiplying out by $1 - x^3$,

$$\begin{aligned} C &= 1 - x^3 - x^6(1-x^4) - x^9(1-x^4)(1-x^5) - \text{etc.} \\ &\qquad + x^3(1-x^4) + x^6(1-x^4)(1-x^5) + x^9(1-x^4)(1-x^5)(1-x^6) + \text{etc.} \\ &= 1 - x^7 - x^{11}(1-x^4) - x^{15}(1-x^4)(1-x^5) - \text{etc.} \end{aligned}$$

When this process was continued, he got

$$C = 1 - x^7 - x^{11}D, \quad D = 1 - x^9 - x^{14}E, \quad E = 1 - x^{11} - x^{17}F, \quad \ldots$$


24 Fuss (1968) vol. 1, pp. 522-524. 
This completed Euler’s proof. To describe it more succinctly, write $S = P_0$, $A = P_1$,
$B = P_2$, $C = P_3$ and so on. If he had completed the inductive step, Euler would have
shown that

$$P_{n-1} = 1 - x^{2n-1} - x^{3n-1}P_n, \tag{25.30}$$

where

$$P_n = \sum_{k=0}^{\infty} x^{kn}(1-x^n)(1-x^{n+1})\cdots(1-x^{n+k}). \tag{25.31}$$
k=0 


Since Euler’s method of proving (25.30) is useful in establishing other identities in 
q-series, we describe it in the general situation. The first step is to break up each term 
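of $P_n$ into two parts:

$$x^{kn}(1-x^n)(1-x^{n+1})\cdots(1-x^{n+k}) = x^{kn}(1-x^{n+1})\cdots(1-x^{n+k}) - x^{(k+1)n}(1-x^{n+1})\cdots(1-x^{n+k}).$$

In the second step, take the second (negative) part and add it to the first part of the next
term of $P_n$:

$$-x^{(k+1)n}(1-x^{n+1})\cdots(1-x^{n+k}) + x^{(k+1)n}(1-x^{n+1})\cdots(1-x^{n+k+1}) = -x^{(k+2)n+k+1}(1-x^{n+1})\cdots(1-x^{n+k}).$$

It can now be seen that

$$P_n = 1 - x^{2n+1} - x^{3n+2}\sum_{k=0}^{\infty} x^{k(n+1)}(1-x^{n+1})\cdots(1-x^{n+k+1}) = 1 - x^{2n+1} - x^{3n+2}P_{n+1},$$
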


proving (25.30) by induction. Euler’s method was used by Gauss, and then in 
1884 Cayley applied it to prove an interesting identity of Sylvester, as discussed in 
Chapter 26. Rogers and Ramanujan independently employed the idea to prove the 
Rogers–Ramanujan identities. Recently, Andrews has further developed this method.
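
The relation (25.30) is an identity between formal power series, so it can be verified coefficient by coefficient up to any chosen order. The sketch below does this with integer lists truncated at $x^N$; the helper names and the truncation order N = 60 are assumptions made for illustration, and the series (25.31) is used only for n ≥ 1.

```python
N = 60  # every series is kept modulo x^N


def mul(a, b):
    """Product of two truncated series."""
    c = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b[: N - i]):
                c[i + j] += ai * bj
    return c


def one_minus_x_pow(e):
    """The series 1 - x^e (just 1 if x^e lies beyond the truncation)."""
    s = [0] * N
    s[0] = 1
    if e < N:
        s[e] -= 1
    return s


def shift(a, e):
    """Multiply the series a by x^e."""
    return ([0] * e + a)[:N]


def P_series(n):
    """P_n = sum_k x^(kn) (1-x^n)(1-x^(n+1))...(1-x^(n+k)), for n >= 1."""
    total = [0] * N
    factor = one_minus_x_pow(n)        # running product (1-x^n)...(1-x^(n+k))
    k = 0
    while k * n < N:
        total = [t + s for t, s in zip(total, shift(factor, k * n))]
        k += 1
        factor = mul(factor, one_minus_x_pow(n + k))
    return total


def rhs(n):
    """1 - x^(2n-1) - x^(3n-1) P_n, the right-hand side of (25.30)."""
    r = one_minus_x_pow(2 * n - 1)
    return [a - b for a, b in zip(r, shift(P_series(n), 3 * n - 1))]


def euler_product():
    """S = (1-x)(1-x^2)(1-x^3)..., truncated."""
    s = one_minus_x_pow(1)
    for k in range(2, N):
        s = mul(s, one_minus_x_pow(k))
    return s


print(all(P_series(n - 1) == rhs(n) for n in range(2, 9)))
print(euler_product() == rhs(1))
```

The last line checks the starting step $S = 1 - x - x^2 P_1$, where S is the infinite product itself.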

A repeated application of (25.30) converts the infinite product in (25.23) to the 
required sum: 


$$\begin{aligned} (1-x)(1-x^2)(1-x^3)(1-x^4)\cdots &= 1 - x - x^2(1-x^3) + x^{2+5}(1-x^5) - x^{2+5+8}(1-x^7) + x^{2+5+8+11}(1-x^9) - \cdots \\ &= 1 - (x+x^2) + (x^5+x^7) - (x^{12}+x^{15}) + (x^{22}+x^{26}) - \cdots. \end{aligned}$$

As Euler noted, this series can also be written as

$$\cdots + x^{26} - x^{15} + x^7 - x^2 + 1 - x + x^5 - x^{12} + x^{22} - \cdots = \sum_{n=-\infty}^{\infty} (-1)^n x^{n(3n-1)/2}.$$


This result of Euler was quite remarkable and its proof ingenious; it made quite an 
impression on the young Gauss who continued Euler’s work in new directions. 


25.5 Gauss: Triangular and Square Numbers Theorem 


We saw Euler’s algebraic virtuosity in his proof of the pentagonal number theorem. 
It is therefore interesting to see Gauss’s extremely skillful performance in his similar 
evaluation of series with triangular and square numbers as exponents. Gauss too 
divided each term of an appropriate series into two parts and added the second part of 
each term to the first part of the next term. These formulas for triangular and square 
exponents are particular cases of the triple product identity. Gauss knew this, but he 
was sufficiently fond of ingenious calculations to make a brief note of his method 
in a paper published only in his collected works, “Zur Theorie der transscendenten
Functionen Gehörig.”²⁵ He gave a proof of the formula

$$\frac{1-x}{1+x}\cdot\frac{1-xx}{1+xx}\cdot\frac{1-x^3}{1+x^3}\cdot\frac{1-x^4}{1+x^4}\cdot\text{etc.} = 1 - 2x + 2x^4 - 2x^9 + 2x^{16} - \text{etc.} \quad\text{for } |x| < 1.$$


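Gauss started with the series:

$$\begin{aligned} P &= 1 + \frac{x^n(1-x^{2n+1})}{(1+x^n)(1+x^{n+1})} + \frac{x^{2n}(1-x^{2n+2})(1-x^{n+1})}{(1+x^n)(1+x^{n+1})(1+x^{n+2})} + \frac{x^{3n}(1-x^{2n+3})(1-x^{n+1})(1-x^{n+2})}{(1+x^n)(1+x^{n+1})(1+x^{n+2})(1+x^{n+3})} + \text{etc.}, \\ Q &= \frac{x^n}{1+x^n} + \frac{x^{2n}}{1+x^n}\cdot\frac{1-x^{n+1}}{1+x^{n+1}} + \frac{x^{3n}}{1+x^n}\cdot\frac{1-x^{n+1}}{1+x^{n+1}}\cdot\frac{1-x^{n+2}}{1+x^{n+2}} + \text{etc.}, \\ R &= P - Q. \end{aligned}$$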

He evaluated R in two different ways. First, he subtracted the kth term in Q from
the kth term in P for each k to get

$$R = \frac{1}{1+x^n} + \sum_{k=1}^{\infty} \frac{x^{kn}(1-x^n)(1-x^{n+1})\cdots(1-x^{n+k-1})}{(1+x^n)(1+x^{n+1})(1+x^{n+2})\cdots(1+x^{n+k})}. \tag{25.32}$$

He denoted this series for R by φ(x,n). To find another series for R, Gauss
subtracted the kth term in Q from the (k + 1)th term in P for each k to get


25 Gauss (1863-1927) vol. 3, pp. 437-439. 
$$R = 1 - \frac{x^{2n+1}}{1+x^{n+1}} - \frac{x^{3n+2}}{1+x^{n+1}}\cdot\frac{1-x^{n+1}}{1+x^{n+2}} - \frac{x^{4n+3}}{1+x^{n+1}}\cdot\frac{1-x^{n+1}}{1+x^{n+2}}\cdot\frac{1-x^{n+2}}{1+x^{n+3}} - \text{etc.}$$

He concluded from this relation that

$$R = 1 - x^{2n+1}\,\phi(x,n+1)$$

or

$$\phi(x,n) = 1 - x^{2n+1}\,\phi(x,n+1).$$

Gauss noted that the relation was true for $n \ge 1$, and for such n

$$\phi(x,n) = 1 - x^{2n+1} + x^{4n+4} - x^{6n+9} + x^{8n+16} - \text{etc.}$$


Note that the series P, Q, and R with $n \ge 1$ are absolutely convergent and that
the terms can be rearranged. When n = 0, the series for P and Q are divergent and
care must be exercised. It is clear from the definition of $\phi(x,n)$ given by (25.32)
that $\phi(x,0) = \frac12$. For clarity, we now employ notation not used by Gauss. Let
$p_1, p_2, p_3, \ldots$ and $q_1, q_2, q_3, \ldots$ denote the consecutive terms of P and Q when n = 0.
Then

$$\phi(x,0) = \lim_{m\to\infty}\bigl((p_1 - q_1) + (p_2 - q_2) + \cdots + (p_m - q_m)\bigr)$$
$$= \lim_{m\to\infty}\bigl(p_1 + (p_2 - q_1) + (p_3 - q_2) + \cdots + (p_m - q_{m-1})\bigr) - \lim_{m\to\infty} q_m.$$


Gauss denoted the second limit, $\lim_{m\to\infty} q_m$, by T and called it the last term of
the series Q, with n = 0. He observed that the first limit could be expressed as
$1 - x\phi(x,1)$. Thus, he had $T = 1 - x\phi(x,1) - \phi(x,0)$ or

$$\phi(x,0) + T = 1 - x + x^4 - x^9 + x^{16} - \text{etc.}$$


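From the definition of T, Gauss could see that

$$T = \frac12\cdot\frac{1-x}{1+x}\cdot\frac{1-xx}{1+xx}\cdot\frac{1-x^3}{1+x^3}\cdots$$

and since $\phi(x,0) = \frac12$, Gauss finally had

$$\frac{T}{\phi(x,0)} = \frac{1-x}{1+x}\cdot\frac{1-xx}{1+xx}\cdot\frac{1-x^3}{1+x^3}\cdots = 1 - 2x + 2x^4 - 2x^9 + 2x^{16} - \cdots.$$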


He gave an abbreviated form of the argument for the series with triangular numbers. 
We reproduce Gauss’s calculation exactly: 


1 — x2n+2 xt. ] — x2nt4 . |] _ ynt2 


P= 


Poxntl © pa yntl — xn+3 
x2n 1 — x2nt+6 ~] — xht2 4 — xn+4 


| | 
LaxMtl. fj —xnt3. | — xnts nee 
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xt. | — xnt2 2h | — tt? 1 _ ynt4 
es | 
Qi= 1 —xntl : J — xntl.y — xn+3 
ot oy — ht2 1 — ynt4 yy — ynt6 
+ etc. 
L—xrtl. | — xnt3 . | — ynts , 
R 1— x" x”. Pox. 1 —xnt2 x28 J xh. —xt2 1 —xnt4 es 
= etc. 
PS [oxrtl Tyo yntl yy — ynt3 TP yet yt gts ‘ 


where $R_1$ was obtained by subtracting $Q_1$ termwise from $P_1$. Gauss denoted this series
for $R_1$ as $\psi(x,n)$. Then, by subtracting the kth term of $Q_1$ from the (k + 1)th term of
$P_1$, Gauss had

$$R_1 = 1 + x^{n+1} + x^{2n+3}\cdot\frac{1-x^{n+2}}{1-x^{n+3}} + x^{2n+3}\cdot x^{n+2}\cdot\frac{1-x^{n+2}}{1-x^{n+3}}\cdot\frac{1-x^{n+4}}{1-x^{n+5}} + \text{etc.}$$
$$= 1 + x^{n+1} + x^{2n+3}\,\psi(x,n+2) = \psi(x,n),$$


when $n \ge 1$. Therefore,

$$\psi(x,n) = 1 + x^{n+1} + x^{2n+3} + x^{3n+6} + x^{4n+10} + \text{etc.}$$


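In the case n = 0, $\psi(x,0) = 0$. Moreover, the last term of the series $Q_1$ was

$$T = \frac{1-xx}{1-x}\cdot\frac{1-x^4}{1-x^3}\cdot\frac{1-x^6}{1-x^5}\ \text{etc.}$$

Hence, the required result followed:

$$1 + x + x^3 + x^6 + x^{10} + \text{etc.} = \frac{1-xx}{1-x}\cdot\frac{1-x^4}{1-x^3}\cdot\frac{1-x^6}{1-x^5}\ \text{etc.}$$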


25.6 Gauss Polynomials and Gauss Sums 


In his paper of 1808, republished in 1811,²⁶ Gauss defined the q-extension of a
binomial coefficient by

$$(m,\mu) = \frac{(1-q^m)(1-q^{m-1})(1-q^{m-2})\cdots(1-q^{m-\mu+1})}{(1-q)(1-q^2)(1-q^3)\cdots(1-q^{\mu})}. \qquad (25.33)$$

He noted the easily verified formula

$$(m,\mu+1) = (m-1,\mu+1) + q^{m-\mu-1}(m-1,\mu). \qquad (25.34)$$

Note that it follows from this that $(m,\mu)$ is a polynomial in q when m is a positive
integer. These polynomials are now called Gaussian polynomials and are extensions
of the binomial coefficients $\binom{m}{\mu}$. We remark that we are using the familiar symbol q,
although Gauss used x.

26 Gauss (1981) pp. 463-495.

Let us now see how Gauss evaluated the polynomial $\sum_{\mu=0}^{m}(m,\mu)x^{\mu}$ for x = −1
and $x = \sqrt{q}$. For x = −1, Gauss used (25.34) to show that

$$f(q,m) = 1 - (m,1) + (m,2) - (m,3) + (m,4) - \cdots$$

satisfied the functional relation

$$f(q,m) = (1-q^{m-1})\,f(q,m-2). \qquad (25.35)$$


Since f(q,0) = 1 and f(q,1) = 0, he deduced that

$$f(q,m) = (1-q)(1-q^3)\cdots(1-q^{m-1}), \quad \text{for } m \text{ even},$$
$$f(q,m) = 0, \quad \text{for } m \text{ odd}. \qquad (25.36)$$
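The recurrence (25.34) and the evaluation (25.36) lend themselves to a direct machine check. The following Python sketch is ours and only illustrative; polynomials in q are stored as coefficient tuples with the constant term first, and all function names are assumptions of the sketch. It builds $(m,\mu)$ from (25.34) and compares the alternating sum f(q,m) with the product in (25.36).

```python
from functools import lru_cache


def poly_add(a, b):
    """Add two polynomials in q given as coefficient tuples."""
    n = max(len(a), len(b))
    a = a + (0,) * (n - len(a))
    b = b + (0,) * (n - len(b))
    return tuple(x + y for x, y in zip(a, b))


def poly_mul(a, b):
    """Multiply two polynomials in q given as coefficient tuples."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return tuple(out)


def trim(a):
    """Drop trailing zero coefficients, keeping at least the constant term."""
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return tuple(a)


@lru_cache(maxsize=None)
def gauss_poly(m, mu):
    """Gaussian polynomial (m, mu), built from the recurrence (25.34)."""
    if mu < 0 or mu > m:
        return (0,)
    if mu == 0 or mu == m:
        return (1,)
    # (25.34) with mu shifted by one: (m, mu) = (m-1, mu) + q^(m-mu) (m-1, mu-1)
    shifted = (0,) * (m - mu) + gauss_poly(m - 1, mu - 1)
    return trim(poly_add(gauss_poly(m - 1, mu), shifted))


def alternating_sum(m):
    """f(q, m) = 1 - (m,1) + (m,2) - ... as a coefficient tuple."""
    total = (0,)
    for mu in range(m + 1):
        sign = -1 if mu % 2 else 1
        total = poly_add(total, tuple(sign * c for c in gauss_poly(m, mu)))
    return trim(total)


def odd_product(m):
    """(1 - q)(1 - q^3)...(1 - q^(m-1)) for even m, as in (25.36)."""
    prod = (1,)
    for k in range(1, m, 2):
        prod = poly_mul(prod, (1,) + (0,) * (k - 1) + (-1,))
    return trim(prod)
```

For instance, `gauss_poly(4, 2)` gives the coefficients of $1 + q + 2q^2 + q^3 + q^4$.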


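For $x = \sqrt{q}$, Gauss wrote

$$F(q,m) = 1 + q^{\frac12}(m,1) + q(m,2) + q^{\frac32}(m,3) + \cdots$$
$$= q^{\frac{m}{2}} + q^{\frac{m-1}{2}}(m,1) + q^{\frac{m-2}{2}}(m,2) + q^{\frac{m-3}{2}}(m,3) + \cdots. \qquad (25.37)$$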


Note that the second (finite) series is identical to the first one, but is in reverse
order. Gauss then multiplied the second series by $q^{\frac{m+1}{2}}$, and added the result to the
first series, yielding

$$(1+q^{\frac{m+1}{2}})F(q,m) = 1 + q^{\frac12}(m,1) + q(m,2) + q^{\frac32}(m,3) + \cdots$$
$$\quad + q^{\frac12}\cdot q^m + q\cdot q^{m-1}(m,1) + q^{\frac32}\cdot q^{m-2}(m,2) + \cdots$$
$$= 1 + q^{\frac12}\bigl(q^m + (m,1)\bigr) + q\bigl((m,2) + q^{m-1}(m,1)\bigr) + q^{\frac32}\bigl((m,3) + q^{m-2}(m,2)\bigr) + \cdots.$$

By (25.34), he concluded that $(1+q^{\frac{m+1}{2}})F(q,m) = F(q,m+1)$; since F(q,0) = 1,
he had the required result

$$F(q,m) = (1+q^{\frac12})(1+q)(1+q^{\frac32})\cdots(1+q^{\frac{m}{2}}). \qquad (25.38)$$


Gauss used formulas (25.36) and (25.38) to show that

$$\sum_{k=0}^{n-1} e^{\frac{2\pi i k^2}{n}} = \frac{1+(-i)^n}{1-i}\sqrt{n}. \qquad (25.39)$$


We note that the expression on the left side is called a quadratic Gauss sum. He
proved this formula in four separate exhaustive cases: for n ≡ 0, 1, 2, 3 (mod 4). Gauss
explained how to convert the expression F(q,m) into a Gauss sum. He set $q^{\frac12} = -y^{-1}$
so that

$$F(y^{-2},m) = 1 - y^{-1}\,\frac{1-y^{-2m}}{1-y^{-2}} + y^{-2}\,\frac{(1-y^{-2m})(1-y^{-2m+2})}{(1-y^{-2})(1-y^{-4})} - \cdots. \qquad (25.40)$$

He then set m = n − 1 and took y to be a primitive root of $y^n - 1 = 0$, to get

$$\frac{1-y^{-2n+2}}{1-y^{-2}} = \frac{1-y^{2}}{1-y^{-2}} = -y^2, \quad \frac{1-y^{-2n+4}}{1-y^{-4}} = \frac{1-y^{4}}{1-y^{-4}} = -y^4, \quad \frac{1-y^{-2n+6}}{1-y^{-6}} = -y^6, \ \ldots$$

Thus, he found

$$F(y^{-2},m) = 1 + y^{-1}\cdot y^2 + y^{-2}\cdot y^2\cdot y^4 + y^{-3}\cdot y^2\cdot y^4\cdot y^6 + \cdots$$
$$= 1 + y + y^4 + y^9 + \cdots + y^{(n-1)^2}. \qquad (25.41)$$


Observe that for $y = e^{\frac{2\pi i}{n}}$ the expression (25.41) was the Gauss sum. From (25.38)
and (25.41), it followed that

$$1 + y + y^4 + \cdots + y^{(n-1)^2} = (1-y^{-1})(1+y^{-2})(1-y^{-3})\cdots(1 \pm y^{-n+1}), \qquad (25.42)$$

when y was a primitive root of $y^n - 1 = 0$. Gauss showed that when $y = e^{\frac{2\pi i}{n}}$, the
product in (25.42) reduced to the expression on the right-hand side of (25.39). The
case n = 4s + 2 is elementary. In fact, for this case he observed that for any primitive
root y, $y^{2s+1} = -1$, so that $y^{(2s+1)^2} = -1$. Moreover, for any integer t,
$$y^{(2s+1+t)^2} = y^{(2s+1)^2 + (4s+2)t + t^2} = -y^{t^2}.$$
Therefore, by cancellation of terms, Gauss found the sum (25.41) to be zero.
Turning to the case n = 4s, he applied (25.42) to evaluate (25.39). Now $y^{(2s+t)^2} = y^{t^2}$
and hence

$$1 + y + y^4 + \cdots + y^{(n-1)^2} = 2\bigl(1 + y + y^4 + \cdots + y^{(2s-1)^2}\bigr). \qquad (25.43)$$


By taking $m = \frac12 n - 1 = 2s - 1$ in (25.40) and using the calculations leading up
to (25.42), Gauss had

$$1 + y + y^4 + \cdots + y^{(2s-1)^2} = (1-y^{-1})(1+y^{-2})(1-y^{-3})\cdots(1-y^{-2s+1}). \qquad (25.44)$$

Then $y^{2s} = -1$, and hence $1 + y^{-2k} = -y^{2s-2k}(1 - y^{-2s+2k})$. He applied this to the
product in (25.44), so that by (25.43)

$$F = 1 + y + y^4 + \cdots + y^{(n-1)^2}$$
$$= 2(-1)^{s-1}y^{ss-s}(1-y^{-1})(1-y^{-2})(1-y^{-3})\cdots(1-y^{-2s+1}). \qquad (25.45)$$

Again, from the fact that $1 - y^{-k} = -y^{-k}(1 - y^{-4s+k})$, Gauss got

$$(1-y^{-1})(1-y^{-2})(1-y^{-3})\cdots(1-y^{-2s+1})$$
$$= (-1)^{2s-1}y^{-2ss+s}(1-y^{-2s-1})(1-y^{-2s-2})(1-y^{-2s-3})\cdots(1-y^{-4s+1}). \qquad (25.46)$$

Therefore, by (25.45) and (25.46)

$$F = 2(-1)^{3s-2}y^{-ss}(1-y^{-2s-1})(1-y^{-2s-2})(1-y^{-2s-3})\cdots(1-y^{-4s+1}). \qquad (25.47)$$


Next, Gauss took the product of (25.45) and (25.47) and multiplied by $1 - y^{-2s}$ to
obtain

$$(1-y^{-2s})F^2 = 4(-1)^{4s-3}y^{-s}(1-y^{-1})(1-y^{-2})(1-y^{-3})\cdots(1-y^{-4s+1}). \qquad (25.48)$$

So Gauss could conclude that $F^2 = 2y^{s}n = \pm 2in$, since $y^{2s} = -1$ or $y^{s} = \pm i$.
Note that he also made use of the fact that the product

$$(1-y^{-1})(1-y^{-2})\cdots(1-y^{-4s+1})$$

was equal to n, because $y^{-1}, \ldots, y^{-4s+1}$ were all the nontrivial nth roots of unity. By
taking square roots, he obtained

$$F = 1 + y + y^4 + \cdots + y^{(n-1)^2} = \pm(1 \pm i)\sqrt{n}. \qquad (25.49)$$


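To determine the sign when $y = e^{\frac{2\pi i}{n}}$, Gauss set $y = \rho^2$ in (25.44) and used
$\rho^n = -1$ to get

$$F = 2(1+\rho^{n-2})(1+\rho^{-4})(1+\rho^{n-6})(1+\rho^{-8})\cdots(1+\rho^{-n+4})(1+\rho^{2}).$$

He rewrote this equation as

$$F = 2(1+\rho^{2})(1+\rho^{-4})(1+\rho^{6})(1+\rho^{-8})\cdots(1+\rho^{-n+4})(1+\rho^{n-2})$$

and observed that $1 + \rho^{\pm 2k} = 2\rho^{\pm k}\cos\frac{k\pi}{n}$, finally concluding that

$$F = 2^{2s}\rho^{s}\cos\frac{\pi}{n}\cos\frac{2\pi}{n}\cos\frac{3\pi}{n}\cdots\cos\frac{(2s-1)\pi}{n}.$$

Now $\rho^{s} = \cos\frac{\pi}{4} + i\sin\frac{\pi}{4} = \frac{1+i}{\sqrt{2}}$ and since all the cosine values were positive, Gauss
determined that the sign in (25.49) was positive. This concluded his proof of the case
n = 4s.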

The other two cases of (25.39), where n is odd, that is, n = 4s + 1 or 4s + 3, are the
most important because they lead to the proof of the quadratic reciprocity theorem. For
these cases, Gauss first gave a detailed derivation using (25.36), although he indicated
that (25.38) could also serve the purpose. So it remained for Gauss to prove that the
Gauss sum in (25.49) was equal to $\sqrt{n}$ when n was of the form 4m + 1, and equal to
$i\sqrt{n}$ when n took the form 4m + 3. He took n = s + 1 where s was even, x was a
primitive root of $x^n - 1 = 0$, and $q = x^{-2}$. Then

$$\frac{1-q^{s-j}}{1-q^{j+1}} = \frac{1-x^{-2(s-j)}}{1-x^{-2j-2}} = \frac{1-x^{2j+2}}{1-x^{-2j-2}} = -x^{2(j+1)}$$

and the Gaussian polynomial was given by

$$(s,k) = (-1)^k x^{2(1+2+\cdots+k)} = (-1)^k x^{k(k+1)}.$$

Using this in (25.36), he had

$$\sum_{k=1}^{n} x^{k(k-1)} = (1-x^{-2})(1-x^{-6})\cdots(1-x^{-2(n-2)})$$
$$= x^{-\frac14(n-1)^2}(x-x^{-1})(x^3-x^{-3})\cdots(x^{n-2}-x^{-n+2}). \qquad (25.50)$$


Since x was an nth root of unity and since

$$\tfrac14(n-1)^2 + k(k-1) = \tfrac14\bigl(n^2 - 2n + (2k-1)^2\bigr),$$

$$x^{\frac14(n-1)^2 + k(k-1)} = x^{\frac14(n+2k-1)^2} = x^{(k+e-1)^2}, \quad \text{where } n + 1 = 2e.$$

Thus Gauss could rewrite (25.50) as

$$W = \sum_{k=0}^{n-1} x^{k^2} = (x-x^{-1})(x^3-x^{-3})\cdots(x^{n-2}-x^{-n+2}). \qquad (25.51)$$


We note that Gauss also worked out a derivation of this formula using (25.38). Now
$x^{n-2} - x^{-(n-2)} = -(x^2 - x^{-2})$ etc. implied that

$$W = (-1)^{\frac{n-1}{2}}(x^2-x^{-2})(x^4-x^{-4})\cdots(x^{n-1}-x^{-n+1}). \qquad (25.52)$$

By multiplying (25.51) and (25.52), he obtained

$$W^2 = (-1)^{\frac{n-1}{2}}(x-x^{-1})(x^2-x^{-2})(x^3-x^{-3})\cdots(x^{n-1}-x^{-n+1}).$$

When n was of the form 4s + 1, the factor $(-1)^{\frac{n-1}{2}}$ became +1 and when n was of
the form 4s + 3, it became −1. Thus, he arrived at

$$W^2 = \pm x^{\frac{n(n-1)}{2}}(1-x^{-2})(1-x^{-4})\cdots(1-x^{-2(n-1)}).$$

Using an argument similar to the one for (25.49), Gauss concluded that $W^2 = \pm n$,
where the + sign applied to n = 4s + 1 and the − sign to 4s + 3. Note that Gauss had
already arrived at this point in 1801, but by a different route. The problem remaining
in 1805 was to choose the correct sign for the square root to obtain W. To find that,
Gauss set $x = e^{\frac{2\pi i}{n}}$ in (25.51) to get

$$W = (2i)^{\frac{n-1}{2}}\sin\frac{2\pi}{n}\sin\frac{6\pi}{n}\cdots\sin\frac{2(n-2)\pi}{n}.$$

Whether n = 4s + 1 or n = 4s + 3, Gauss saw that there were clearly s negative
factors in the sine product. Thus, he could conclude that $W = \sqrt{n}$ for n = 4s + 1 and
$W = i\sqrt{n}$ for n = 4s + 3.
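The four cases of the quadratic Gauss sum are easy to confirm numerically. The short Python sketch below is our own illustration and plays no role in Gauss's proof; the function names are arbitrary. It sums $e^{2\pi i k^2/n}$ directly and compares the result with the values $(1+i)\sqrt{n}$, $\sqrt{n}$, 0, and $i\sqrt{n}$ for n ≡ 0, 1, 2, 3 (mod 4).

```python
import cmath
import math


def gauss_sum(n):
    """Directly sum exp(2*pi*i*k^2/n) for k = 0, ..., n-1."""
    return sum(cmath.exp(2j * cmath.pi * k * k / n) for k in range(n))


def gauss_value(n):
    """Closed form of the quadratic Gauss sum, by residue of n modulo 4."""
    root = math.sqrt(n)
    return {0: (1 + 1j) * root, 1: root, 2: 0, 3: 1j * root}[n % 4]
```

The direct sum carries rounding error, so any comparison should use a small tolerance rather than exact equality.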


25.7 Gauss’s q-Binomial Theorem and the Triple Product Identity


Gauss wrote a paper “Hundert Theoreme über die neuen Transcendenten,” but he did
not publish it. In this paper, he derived a form of the terminating q-binomial theorem.
He then wrote the result in a symmetric form and by an ingenious argument derived
the triple product identity.²⁷ We follow Gauss’s notation and proof: He stated the
terminating q-binomial theorem in the form

$$1 + \frac{a^n-1}{a-1}\,t + \frac{(a^n-1)(a^n-a)}{(a-1)(aa-1)}\,tt + \frac{(a^n-1)(a^n-a)(a^n-aa)}{(a-1)(aa-1)(a^3-1)}\,t^3 + \text{etc.}$$
$$= (1+t)(1+at)(1+aat)\cdots(1+a^{n-1}t). \qquad (25.53)$$


Recall that we would write q instead of a. To prove the formula inductively, he
denoted the sum as T and multiplied it by $(1 + a^n t)$ to obtain a series of the same
form with n changed to n + 1. The reader may work out this calculation. Gauss next
observed that by taking $T = \theta(n)$, one could see that $T(1 + a^n t) = \theta(n+1)$. Thus,
the terminating q-binomial theorem was proved inductively.

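To prove the triple-product identity, he wrote his result in a symmetric form. He
took n even, set $y = a^{\frac{n-1}{2}}t$ and $x^2 = a$ to transform (25.53) into

$$1 + \frac{1-x^n}{1-x^{n+2}}\,x\Bigl(y+\frac1y\Bigr) + \frac{1-x^n}{1-x^{n+2}}\cdot\frac{1-x^{n-2}}{1-x^{n+4}}\,x^4\Bigl(yy+\frac{1}{yy}\Bigr) + \frac{1-x^n}{1-x^{n+2}}\cdot\frac{1-x^{n-2}}{1-x^{n+4}}\cdot\frac{1-x^{n-4}}{1-x^{n+6}}\,x^9\Bigl(y^3+\frac{1}{y^3}\Bigr) + \text{etc.}$$
$$= \frac{1-xx}{1-x^{n+2}}\cdot\frac{1-x^4}{1-x^{n+4}}\cdot\frac{1-x^6}{1-x^{n+6}}\cdots\frac{1-x^n}{1-x^{2n}}\,(1+xy)(1+x^3y)\cdots(1+x^{n-1}y)\Bigl(1+\frac{x}{y}\Bigr)\Bigl(1+\frac{x^3}{y}\Bigr)\cdots\Bigl(1+\frac{x^{n-1}}{y}\Bigr). \qquad (25.54)$$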


27 Gauss (1863-1927) vol. 3, pp. 461-464.
Next, he took |x| < 1, so that $x^n \to 0$ as $n \to \infty$. The result was the triple product
identity:

$$1 + x\,(y+y^{-1}) + x^4(yy+y^{-2}) + x^9(y^3+y^{-3}) + \cdots$$
$$= (1-xx)(1-x^4)(1-x^6)\cdots(1+xy)(1+x^3y)(1+x^5y)\cdots\Bigl(1+\frac{x}{y}\Bigr)\Bigl(1+\frac{x^3}{y}\Bigr)\Bigl(1+\frac{x^5}{y}\Bigr)\cdots. \qquad (25.55)$$
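The triple product identity converges rapidly for |x| < 1, so it can be tested with ordinary floating-point arithmetic. The following Python sketch is our own numerical illustration, with an arbitrarily chosen truncation parameter `terms`; it is not a proof and not Gauss's method. It returns the truncated series and the truncated product of (25.55) for given x and nonzero y.

```python
def triple_product_sides(x, y, terms=25):
    """Return (series, product) for the two sides of (25.55), both truncated.

    Assumes |x| < 1 and y != 0; `terms` is an illustrative cutoff.
    """
    series = 1.0
    for k in range(1, terms):
        series += x ** (k * k) * (y ** k + y ** (-k))
    product = 1.0
    for j in range(1, terms):
        product *= (
            (1 - x ** (2 * j))
            * (1 + x ** (2 * j - 1) * y)
            * (1 + x ** (2 * j - 1) / y)
        )
    return series, product
```

For moderate values such as x = 0.3 and y = 1.7 the two returned numbers agree to within rounding error; values of |x| close to 1 would need a larger cutoff.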


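Now let us examine the algebraic steps Gauss used to get the required symmetric
form (25.54). Note that the last term in the left-hand side of (25.53) was $a^{\frac{n(n-1)}{2}}t^n = y^n$.
Combining the first and last terms of the sum, then the second to the last but one,
and so on, he arrived at

$$1 + y^n + \frac{1-a^n}{1-a}\cdot\frac{y+y^{n-1}}{x^{n-1}} + \frac{1-a^n}{1-a}\cdot\frac{1-a^{n-1}}{1-aa}\cdot\frac{yy+y^{n-2}}{x^{2n-4}} + \cdots + \frac{1-a^n}{1-a}\cdot\frac{1-a^{n-1}}{1-aa}\cdots\frac{1-a^{\frac12 n+1}}{1-a^{\frac12 n}}\cdot\frac{y^{\frac12 n}}{x^{\frac14 nn}}$$
$$= A\Bigl(1 + \frac{1-x^n}{1-x^{n+2}}\,x\,(y+y^{-1}) + \frac{1-x^n}{1-x^{n+2}}\cdot\frac{1-x^{n-2}}{1-x^{n+4}}\,x^4(yy+y^{-2}) + \frac{1-x^n}{1-x^{n+2}}\cdot\frac{1-x^{n-2}}{1-x^{n+4}}\cdot\frac{1-x^{n-4}}{1-x^{n+6}}\,x^9(y^3+y^{-3}) + \text{etc.}\Bigr),$$

where A could be written as

$$A = \frac{1-x^{n+2}}{1-xx}\cdot\frac{1-x^{n+4}}{1-x^4}\cdot\frac{1-x^{n+6}}{1-x^6}\cdots\frac{1-x^{2n}}{1-x^n}\cdot\frac{y^{\frac12 n}}{x^{\frac14 nn}}.$$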


He rewrote the product $(1+t)(1+at)(1+aat)\cdots(1+a^{n-1}t)$ as

$$\Bigl(1+\frac{y}{x^{n-1}}\Bigr)\Bigl(1+\frac{y}{x^{n-3}}\Bigr)\cdots\Bigl(1+\frac{y}{x}\Bigr)(1+xy)(1+x^3y)\cdots(1+x^{n-1}y).$$

To complete the calculations necessary for the symmetric form (25.54), it was
sufficient for Gauss to observe that the first half of the product could be rewritten as

$$\frac{y^{\frac12 n}}{x^{\frac14 nn}}\Bigl(1+\frac{x^{n-1}}{y}\Bigr)\Bigl(1+\frac{x^{n-3}}{y}\Bigr)\cdots\Bigl(1+\frac{x}{y}\Bigr).$$

It is interesting to note that the triple product formula (25.55) contains a plethora
of important special cases. Euler’s pentagonal numbers identity follows on taking
$x = q^{\frac32}$ and $y = -q^{\frac12}$. Gauss’s formula for triangular numbers, derived earlier,
follows by taking $x = q^{\frac12}$ and $y = q^{\frac12}$.

It is surprising that Gauss did not publish his work related to the triple product. 
Gauss’s 1808 paper correctly noted the significance of the Gaussian polynomial $(m,\mu)$
and later work of O. Rodrigues, P. MacMahon, and others revealed the combinatorial 
import of the Gaussian polynomial and its generalization. In addition, Gaussian 
polynomials played an important role in Cayley and Sylvester’s development of 
invariant theory. It remained for Jacobi to rediscover the triple product formula and 
use it in his theory of elliptic functions. 


25.8 Jacobi: Triple Product Identity 


In his work in the theory of elliptic functions, Jacobi encountered numerous infinite 
products, a large number of which were particular cases of the product side of the 
triple-product identity. And the product side of this identity was composed of two 
infinite products, first elucidated by Euler, of the form (25.1). Jacobi gave two proofs 
of the triple product identity in his Fundamenta Nova; we present the second.²⁸
Because Jacobi wished to convert his products in elliptic function theory into series, it
was only natural for him to start with Euler’s formula (25.3). Change q to $q^2$ and x to
$\frac{z}{q}$ to get

$$(1+qz)(1+q^3z)(1+q^5z)(1+q^7z)\cdots = 1 + \frac{q}{1-q^2}\,z + \frac{q^4}{(1-q^2)(1-q^4)}\,z^2 + \frac{q^9}{(1-q^2)(1-q^4)(1-q^6)}\,z^3 + \cdots.$$


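Jacobi then multiplied this equation by one in which z was replaced by $\frac1z$, to obtain

$$(1+qz)(1+q^3z)(1+q^5z)\cdots\Bigl(1+\frac{q}{z}\Bigr)\Bigl(1+\frac{q^3}{z}\Bigr)\Bigl(1+\frac{q^5}{z}\Bigr)\cdots$$
$$= \Bigl(1 + \frac{q}{1-q^2}\,z + \frac{q^4}{(1-q^2)(1-q^4)}\,z^2 + \frac{q^9}{(1-q^2)(1-q^4)(1-q^6)}\,z^3 + \cdots\Bigr)$$
$$\times \Bigl(1 + \frac{q}{1-q^2}\,\frac1z + \frac{q^4}{(1-q^2)(1-q^4)}\,\frac{1}{z^2} + \frac{q^9}{(1-q^2)(1-q^4)(1-q^6)}\,\frac{1}{z^3} + \cdots\Bigr).$$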


28 Jacobi (1969) vol. 1, pp. 232-234.
Jacobi observed that the coefficient of $z^n + \frac{1}{z^n}$ in the product on the right-hand
side was

$$\frac{q^{nn}}{(1-q^2)(1-q^4)\cdots(1-q^{2n})}\Bigl(1 + \frac{q^2}{1-q^2}\cdot\frac{q^{2n}}{1-q^{2n+2}} + \frac{q^8}{(1-q^2)(1-q^4)}\cdot\frac{q^{4n}}{(1-q^{2n+2})(1-q^{2n+4})}$$
$$\quad + \frac{q^{18}}{(1-q^2)(1-q^4)(1-q^6)}\cdot\frac{q^{6n}}{(1-q^{2n+2})(1-q^{2n+4})(1-q^{2n+6})} + \cdots\Bigr). \qquad (25.56)$$


It seems that Jacobi had some trouble simplifying this expression and this delayed
him for quite a while. But he succeeded in resolving the problem by proving that

$$\prod_{n=1}^{\infty}\frac{1}{1-q^nz} = 1 + \sum_{n=1}^{\infty}\frac{q^{n^2}z^n}{(1-q)(1-q^2)\cdots(1-q^n)\,(1-qz)(1-q^2z)\cdots(1-q^nz)}. \qquad (25.57)$$


He replaced q by $q^2$ and then set $z = q^{2n}$ to sum the series in (25.56). He thus
found the coefficient of $z^n + \frac{1}{z^n}$ to be

$$\frac{q^{nn}}{(1-q^2)(1-q^4)(1-q^6)\cdots}.$$


This proved the triple product identity. To prove (25.57), Jacobi assumed that the
product on the left-hand side could be expressed as a sum of terms of the form

$$\frac{A_nz^n}{(1-qz)\cdots(1-q^nz)}.$$

For $A_n$, he applied the standard procedure of changing z to qz to get a functional
relation. Obviously, the difficult point here was to conceive of that form of the series 
in which the variable z would also appear in the denominator. Neither Euler nor Gauss 
came up with such a series. 
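Formula (25.57) can also be checked numerically for moderate values of q and z. The Python sketch below is our own illustration under the assumption |q| < 1 with no factor $1 - q^n z$ vanishing; the cutoffs 200 and 20 are arbitrary choices that suffice for values such as q = 0.5. It returns the truncated product and the truncated series.

```python
def jacobi_25_57(q, z, terms=20):
    """Return (product, series) for the two sides of (25.57), both truncated.

    Assumes |q| < 1 and that no factor 1 - q^n z is zero; the cutoffs
    are illustrative and chosen for moderate q such as 0.3 or 0.5.
    """
    product = 1.0
    for n in range(1, 200):
        product /= 1 - q ** n * z
    series = 1.0
    denom = 1.0
    for n in range(1, terms):
        # Accumulate (1-q)...(1-q^n) * (1-qz)...(1-q^n z)
        denom *= (1 - q ** n) * (1 - q ** n * z)
        series += q ** (n * n) * z ** n / denom
    return product, series
```

The factor $q^{n^2}$ makes the series converge far faster than the product, which is one way to appreciate the extra denominators Jacobi introduced.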

Jacobi’s formula (25.57) is very interesting. Note that the product on the left-hand
side is the same as the product in Euler’s second formula (25.20) but the series on the right,
though similar in appearance, has an additional factor in the denominator of each term
of the sum. Jacobi may have asked whether it was possible to directly transform one
series into the other. This suggests a transformation theory of q-series similar to that
for hypergeometric series. Heinrich Eduard Heine (1821-1881) paved the way for the
study of transformations of q-series in his 1846 theory of the q-hypergeometric series.
We also note that in 1843, Cauchy gave a generalization of (25.57).

25.9 Eisenstein: q-Binomial Theorem


Eisenstein’s “Neuer Beweis und Verallgemeinerung des binomischen Lehrsatzes”²⁹
was one of three papers he submitted to Crelle’s Journal in May 1844. In this paper,
he proved the general q-binomial theorem, although Eisenstein wrote p instead of q,
and deduced from it the ordinary binomial theorem. His proof was based on an idea of
Euler and employed the multiplication of series. Eisenstein did not refer to Euler, but
mentioned Dirichlet and Martin Ohm, who may have discussed Euler’s idea in their
lectures. Eisenstein first proved the finite case of the q-binomial theorem. For this he
defined for a positive integer α,

$$\phi(x,\alpha) = (1+x)(1+qx)(1+q^2x)\cdots(1+q^{\alpha-1}x). \qquad (25.58)$$


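He proved Rothe’s formula without reference to Rothe:

$$\phi(x,\alpha) = \sum_{t=0}^{\alpha} A_tx^t, \qquad (25.59)$$

where

$$A_t = \frac{(1-q^{\alpha})(1-q^{\alpha-1})\cdots(1-q^{\alpha-t+1})}{(1-q)(1-q^2)\cdots(1-q^t)}\,q^{\frac{t(t-1)}{2}}. \qquad (25.60)$$

Note that this was done in the standard way by using the relation
$$(1+q^{\alpha}x)\,\phi(x,\alpha) = (1+x)\,\phi(qx,\alpha).$$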


Eisenstein stated the general q-binomial theorem in the form

$$\phi(x,\alpha) = \sum_{t=0}^{\infty} A_tx^t = \frac{(1+x)(1+qx)(1+q^2x)\cdots}{(1+q^{\alpha}x)(1+q^{\alpha+1}x)(1+q^{\alpha+2}x)\cdots}, \qquad (25.61)$$

where |q| < 1, and α was any number. To prove this, he first wished to show that

$$\phi(x,\alpha+\beta) = \phi(x,\alpha)\,\phi(q^{\alpha}x,\beta). \qquad (25.62)$$


For this purpose, he demonstrated that
$$C_t = A_t + A_{t-1}B_1q^{\alpha} + A_{t-2}B_2q^{2\alpha} + \cdots + B_tq^{t\alpha}, \qquad (25.63)$$

where $B_t$ and $C_t$ were obtained from (25.60) by replacing α by β and α by α + β,
respectively. He noted that (25.62) and (25.63) were clearly true when α and β were
positive integers. Eisenstein then set $u = q^{\alpha}$ and $v = q^{\beta}$ and observed that both
sides of (25.63) were equal for infinitely many values of u and v, and thus (25.63) was
identically true. At this point, Eisenstein noted that the proof could be completed in 
the usual manner and referred to Dirichlet and Ohm. From Chapter 4, one may see 


29 Eisenstein (1975) pp. 117-121.
that Eisenstein intended to use (25.62) to prove (25.61) for all integers α, and then for
all rational numbers, and finally (by continuity) for all real α.


25.10 Jacobi’s q-Series Identity


In 1846, Jacobi proved the q-binomial theorem³⁰ and obtained an extension of the
Vandermonde identity, as well as an extension of Gauss’s $_2F_1$ summation formula.
Recall Gauss’s summation formula: 


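$$\sum_{n=0}^{\infty}\frac{(a)_n(b)_n}{n!\,(c)_n} = \frac{\Gamma(c)\,\Gamma(c-a-b)}{\Gamma(c-a)\,\Gamma(c-b)}$$

when Re(c − a − b) > 0. Note that when a = −m, a negative integer, we have
Vandermonde’s identity

$$\sum_{n=0}^{m}\frac{(-m)_n(b)_n}{n!\,(c)_n} = \frac{(c-b)_m}{(c)_m}. \qquad (25.64)$$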
This identity is not difficult and follows immediately from the Gregory—Newton 
interpolation formula; it can also be obtained by multiplying two binomial series and 
equating coefficients. In about 1975,³¹ while studying Needham’s excellent work,³²
Richard Askey saw that around 1301, the Chinese mathematician Chu Shih-Chieh
(also Zhu Shijie) discovered two equations; when they are combined, they yield the
Vandermonde identity, found by Vandermonde in 1772.³³ Later, when Chu Shih-Chieh’s
work was fully translated,³⁴ Askey and I studied it very carefully and found
only one of the required equations. Thus, denoting Vandermonde’s identity as the 
Chu—Vandermonde identity might possibly be an exaggeration. 
In Jacobi’s notation, the q-binomial theorem was stated as

$$[w,v] = 1 + \frac{v-w}{1-x}\,z + \frac{(v-w)(v-xw)}{(1-x)(1-x^2)}\,z^2 + \frac{(v-w)(v-xw)(v-x^2w)}{(1-x)(1-x^2)(1-x^3)}\,z^3 + \cdots$$
$$= \frac{(1-wz)(1-xwz)(1-x^2wz)(1-x^3wz)\cdots}{(1-vz)(1-xvz)(1-x^2vz)(1-x^3vz)\cdots}. \qquad (25.65)$$

Let $\phi(z)$ denote the product. In his proof, Jacobi assumed that

$$\phi(z) = 1 + A_1z + A_2z^2 + A_3z^3 + A_4z^4 + \cdots$$


30 Jacobi (1969) vol. 6, pp. 163-173. 
31 Askey (1975) pp. 59-60. 

32 Needham (1959) p. 138. 

33 Vandermonde (1772). 

34 Hoe (2007).
and observed that $\phi(z)$ satisfied the functional relation
$$\phi(z) - \phi(xz) = vz\,\phi(z) - wz\,\phi(xz).$$
Thus, the coefficients $A_1, A_2, A_3, \ldots$ satisfied the equations
$$(1-x)A_1 = v-w, \quad (1-x^2)A_2 = (v-xw)A_1, \quad (1-x^3)A_3 = (v-x^2w)A_2, \ \ldots.$$


By induction, this gives the desired result. 

Now note that from the product expression for [w,v] it is easy to see that
[w,v][v,1] = [w,1]. If the corresponding series are substituted in this equation and
the coefficient of $z^p$ equated on the two sides, then the result is a q-extension of the
Vandermonde identity. Jacobi wrote that he saw this result in Schweins’s Analysis:³⁵

$$\frac{(1-w)(1-xw)(1-x^2w)\cdots(1-x^{p-1}w)}{(1-x)(1-x^2)(1-x^3)\cdots(1-x^p)}$$
$$= \sum_{k=0}^{p}\frac{(v-w)(v-xw)\cdots(v-x^{k-1}w)}{(1-x)(1-x^2)\cdots(1-x^k)}\cdot\frac{(1-v)(1-xv)\cdots(1-x^{p-k-1}v)}{(1-x)(1-x^2)\cdots(1-x^{p-k})}. \qquad (25.66)$$


Note that the empty products occurring in the sum have the value 1. Now, it is 
possible to prove Gauss’s formula from the Vandermonde or Chu—Vandermonde 
identity, but it is not easy, and such a proof was not known in Jacobi’s time. But in a 
beautiful argument, Jacobi used (25.66), the q-extension of Vandermonde, to prove a
q-extension of Gauss’s formula. He divided both sides of the equation by the first term 
on the right-hand side, to get (after a change of variables) 


=a L=sw dS 7) ae 
d—r(l —xr)(1 — x2r)--- (1 —xP-!r) 


Pp k-1 
or ,(u—r)u—xr)---(u—x*'r) 
=14 2 1) dpa sda (25.67) 
= -k 
td | ce a er ae ie (25.68) 
(l—r)(1 —xr)--- (1 —xk-!r) 
Jacobi stated the extension of Gauss’s formula in the form
$$1 + \frac{(1-s)(1-t)}{(1-x)(1-r)}\,r + \frac{(1-s)(x-s)(1-t)(x-t)}{(1-x)(1-x^2)(1-r)(1-xr)}\,r^2$$
$$\quad + \frac{(1-s)(x-s)(x^2-s)(1-t)(x-t)(x^2-t)}{(1-x)(1-x^2)(1-x^3)(1-r)(1-xr)(1-x^2r)}\,r^3 + \cdots$$
$$= \frac{(1-sr)(1-tr)}{(1-r)(1-str)}\cdot\frac{(1-xsr)(1-xtr)}{(1-xr)(1-xstr)}\cdot\frac{(1-x^2sr)(1-x^2tr)}{(1-x^2r)(1-x^2str)}\cdots. \qquad (25.69)$$


35 Schweins (1820).
He showed that when $t = x^p$, and p = 0,1,2,..., this was reduced to the identity
(25.68). Thus, (25.69) was true for an infinite number of values of t and, by symmetry,
for an infinite number of values of s. He then observed that

$$\frac{(1-sr)(1-tr)}{(1-r)(1-str)} = 1 + c_1r + c_2r^2 + c_3r^3 + \cdots$$

where $c_1, c_2, c_3, \ldots$ were polynomials in s and t. This implied that the product on the
right-hand side of (25.69) was of the form

$$(1 + c_1r + c_2r^2 + c_3r^3 + \cdots)(1 + c_1xr + c_2x^2r^2 + \cdots)(1 + c_1x^2r + c_2x^4r^2 + \cdots)\cdots.$$
This product would then be of the form $1 + b_1r + b_2r^2 + b_3r^3 + \cdots$, where
$b_1, b_2, b_3, \ldots$ were polynomials in s and t. To complete the proof, Jacobi wrote the
left-hand side of (25.69) in powers of r as $1 + k_1r + k_2r^2 + \cdots$, so that $k_1, k_2, \ldots$
were polynomials in s and t. Jacobi concluded that $b_i = k_i$ because it held for an
infinite number of values of s and t. This completed the proof of (25.69). Jacobi also
observed, without giving a precise definition, that the products on the right-hand side 
of (25.69) could be considered q-analogs of the gamma functions in Gauss’s formula. 
Very soon after this, Heine obtained a nearly correct definition. 


25.11 Cauchy and Ramanujan: The Extension of the Triple Product 


In 1843, Augustin-Louis Cauchy published an important paper³⁶ containing the first
statement and proof of the general q-binomial theorem and an extension of the 
triple product identity. To be clear and succinct in stating the results of Cauchy and 
Ramanujan, we introduce the following modern notation: Let 


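$$(a;q)_n = (1-a)(1-aq)\cdots(1-aq^{n-1}), \quad \text{for } n \ge 1,$$
$$= 1, \quad \text{for } n = 0,$$
$$= \frac{1}{(1-q^{-1}a)(1-q^{-2}a)\cdots(1-q^{n}a)}, \quad \text{for } n < 0.$$

And $(a;q)_\infty = (1-a)(1-qa)(1-q^2a)\cdots$. Using this notation, the q-binomial
theorem can be stated as

$$\sum_{n=0}^{\infty}\frac{(a;q)_n}{(q;q)_n}\,x^n = \frac{(ax;q)_\infty}{(x;q)_\infty}.$$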


For convergence we require |x| < 1, |q| < 1. Cauchy’s extension of the triple product
identity can now be stated for 0 < |bx| < 1: 


S14 \ nn AX D0 (434) 4, G5 Doo 
pe ee . (bX; 4) oo (24:4) . 


oe) 


36 Cauchy (1843a).
Here, Cauchy failed to find the better result, called the Ramanujan $_1\psi_1$ sum,
generalizing the q-binomial theorem as well as the triple product identity. G. H. Hardy
found this theorem without proof in Srinivasa Ramanujan’s (1887-1920) notebooks
and published it in his 1940 lectures on Ramanujan’s work.³⁷ This formula provides
the basis for the study of bilateral q-series.

Ramanujan’s theorem can be stated as

$$\sum_{n=-\infty}^{\infty}\frac{(a;q)_n}{(b;q)_n}\,x^n = \frac{(ax;q)_\infty\,(\frac{q}{ax};q)_\infty\,(q;q)_\infty\,(\frac{b}{a};q)_\infty}{(x;q)_\infty\,(\frac{b}{ax};q)_\infty\,(b;q)_\infty\,(\frac{q}{a};q)_\infty},$$

where |q| < 1 and $|\frac{b}{a}| < |x| < 1$.
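Because the bilateral sum uses $(a;q)_n$ for negative n, a numerical example may help fix the conventions. The Python sketch below is our own illustration with arbitrary truncation lengths; it assumes real parameters with |q| < 1 and |b/a| < |x| < 1, and that none of the factors in the denominators vanishes. Terms with negative index are generated from the ratio $(1 - bq^{-k})/(1 - aq^{-k})$, which follows from the definition of $(a;q)_n$ for n < 0.

```python
def qpoch_inf(a, q, terms=200):
    """Truncated infinite product (a; q)_infinity = prod_{k>=0} (1 - a q^k)."""
    prod = 1.0
    for k in range(terms):
        prod *= 1 - a * q ** k
    return prod


def psi_sum(a, b, x, q, terms=120):
    """Truncated bilateral sum of (a;q)_n / (b;q)_n * x^n over all integers n."""
    total = 1.0  # the n = 0 term
    term = 1.0
    for n in range(terms):  # step from index n to n + 1
        term *= (1 - a * q ** n) / (1 - b * q ** n) * x
        total += term
    term = 1.0
    for k in range(1, terms + 1):  # step from index -(k-1) to -k
        term *= (1 - b * q ** (-k)) / (1 - a * q ** (-k)) / x
        total += term
    return total


def psi_product(a, b, x, q):
    """Right-hand side of Ramanujan's theorem, with truncated products."""
    num = (qpoch_inf(a * x, q) * qpoch_inf(q / (a * x), q)
           * qpoch_inf(q, q) * qpoch_inf(b / a, q))
    den = (qpoch_inf(x, q) * qpoch_inf(b / (a * x), q)
           * qpoch_inf(b, q) * qpoch_inf(q / a, q))
    return num / den
```

With the sample values a = 2, b = 0.5, x = 0.6, q = 0.3, both functions return approximately −0.648.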


25.12 Rodrigues and MacMahon: Combinatorics 


Olinde Rodrigues and Percy Alexander MacMahon made important contributions to 
combinatorial problems connected with Gaussian polynomials and their generaliza- 
tions. Olinde Rodrigues (1794-1851) was a French mathematician whose ancestors 
most probably left Spain, fleeing the persecution of the Jews. He studied at the 
Lycée Impérial in Paris and then at the new Université de Paris. He published six 
mathematical papers during 1813-16, one of which contains his well-known formula 
for Legendre polynomials. He did not pursue an academic career, perhaps because of 
religious discrimination. In fact, he apparently gave up mathematical research for over 
two decades, returning to it in 1838; he then produced papers on combinatorics and an 
important work on rotations. 

Rodrigues’s theorem from 1839³⁸ gave the generating function for the number of
permutations Z(n,k) of n distinct objects with k inversions; this was the number of
permutations $a_1, a_2, \ldots, a_n$ of 1,2,3,...,n with k pairs $(a_i, a_j)$, such that i < j and
$a_i > a_j$. The values of k range from 0 to $\frac{n(n-1)}{2}$. To find the generating function of
Z(n,k), Rodrigues argued that Z(n,k) was the number of integer solutions of the
equation

$$x_0 + x_1 + \cdots + x_{n-1} = k,$$

where $0 \le x_i \le i$ for i = 0,1,...,n − 1. This implied that Z(n,k) was the
coefficient of $t^k$ in the product

$$(1+t)(1+t+t^2)\cdots(1+t+t^2+\cdots+t^{n-1}).$$
As immediate corollaries, Rodrigues had
$$Z(n,0) + Z(n,1) + \cdots + Z\bigl(n,\tfrac{n(n-1)}{2}\bigr) = n!,$$
$$Z(n,0) - Z(n,1) + Z(n,2) - \cdots + (-1)^{\frac{n(n-1)}{2}}Z\bigl(n,\tfrac{n(n-1)}{2}\bigr) = 0.$$


37 Hardy (1940) p. 222. 
38 Rodrigues (1839).
The first relation also answered a question posed by Stern on the sum of all the
inversions in the permutations of n letters. Note that we can write Rodrigues’s result as

$$\sum_{k=0}^{\frac{n(n-1)}{2}} Z(n,k)\,q^k = \frac{(1-q)(1-q^2)\cdots(1-q^n)}{(1-q)^n}. \qquad (25.70)$$

Then, we see that the expression on the right-hand side is the q-extension of n!.
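For small n, Rodrigues's result can be confirmed by brute force. The Python sketch below is our own illustration with hypothetical function names; it counts the inversions of every permutation of 1, ..., n and compares the tallies with the coefficients of $(1)(1+q)(1+q+q^2)\cdots(1+q+\cdots+q^{n-1})$, which is the right-hand side of (25.70) after each factor $(1-q^i)/(1-q)$ is expanded.

```python
from itertools import permutations


def inversion_counts(n):
    """counts[k] = number of permutations of 1..n with exactly k inversions."""
    counts = [0] * (n * (n - 1) // 2 + 1)
    for p in permutations(range(1, n + 1)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        counts[inv] += 1
    return counts


def q_factorial(n):
    """Coefficients of (1)(1+q)(1+q+q^2)...(1+q+...+q^(n-1))."""
    coeffs = [1]
    for i in range(1, n + 1):
        new = [0] * (len(coeffs) + i - 1)
        for d, c in enumerate(coeffs):
            for e in range(i):  # multiply by 1 + q + ... + q^(i-1)
                new[d + e] += c
        coeffs = new
    return coeffs
```

The enumeration grows like n!, so the check is only practical for small n.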

In 1913, MacMahon found another important way of classifying permutations 
by defining the greater index of a permutation.³⁹ For a permutation $a_1, a_2, \ldots, a_n$ of
1,2,3,...,n, MacMahon defined the greater index to be the sum $\sum_{i=1}^{n-1}\lambda(a_i)$, where
$\lambda(a_i) = i$ if $a_i > a_{i+1}$, and $\lambda(a_i) = 0$ otherwise. Let G(n,k) denote the number of
permutations for which the greater index is equal to k. MacMahon proved that

$$\sum_{k=0}^{\frac{n(n-1)}{2}} G(n,k)\,q^k = \frac{(1-q)(1-q^2)\cdots(1-q^n)}{(1-q)^n}. \qquad (25.71)$$
This immediately gave him the result 
G(n,k) = Z(n,k). (25.72) 
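The equality (25.72) is striking because the two statistics usually differ on an individual permutation. A brute-force comparison for small n, sketched below in Python as our own illustration (the helper names are arbitrary), makes the equidistribution concrete.

```python
from itertools import permutations


def inversions(p):
    """Number of pairs i < j with p[i] > p[j]."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])


def greater_index(p):
    """Sum of the positions i (1-based) at which p[i] > p[i+1]."""
    return sum(i + 1 for i in range(len(p) - 1) if p[i] > p[i + 1])


def distribution(n, statistic):
    """counts[k] = number of permutations of 1..n on which statistic equals k."""
    counts = [0] * (n * (n - 1) // 2 + 1)
    for p in permutations(range(1, n + 1)):
        counts[statistic(p)] += 1
    return counts
```

For example, the permutation 2, 3, 1 has two inversions and greater index 2, while 3, 1, 2 has two inversions but greater index 1; only the overall counts agree.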


In fact, MacMahon proved his theorems even more generally, for permutations of
multisets. In a multiset, the elements need not be distinct. For example, $1^{m_1}2^{m_2}\cdots r^{m_r}$
denotes a multiset with $m_1$ ones, $m_2$ twos, and so on. The concepts of inversion
and greater index can be extended in an obvious way to multisets. So if
$Z(m_1,m_2,\ldots,m_r;k)$ and $G(m_1,m_2,\ldots,m_r;k)$ denote the number of permutations
with k inversions and the number of permutations with greater index k, then
MacMahon had

$$\sum_k Z(m_1,m_2,\ldots,m_r;k)\,q^k = \sum_k G(m_1,m_2,\ldots,m_r;k)\,q^k$$
$$= \frac{(1-q)(1-q^2)\cdots(1-q^{m_1+m_2+\cdots+m_r})}{(1-q)\cdots(1-q^{m_1})\,(1-q)\cdots(1-q^{m_2})\cdots(1-q)\cdots(1-q^{m_r})}. \qquad (25.73)$$

Note that when r = 2, the expression on the right is the Gaussian polynomial
$(m_1+m_2, m_1)$, in Gauss’s notation. Just as the Gaussian polynomial is the q-binomial
coefficient, we can see that (25.73) is the q-multinomial coefficient.

MacMahon (1854-1929) studied at the military academy at Woolwich. He became
a lieutenant in 1872, captain in 1881, and major in 1889. He returned to Woolwich
as an instructor in 1882. This teaching post, along with his friendship with the
mathematician George Greenhill, set the scene for MacMahon to exercise his mathematical
talents. Starting in the early 1880s, he contributed numerous important papers
to the subject of combinatorics and related topics, including symmetric functions
and invariants. He was also a fast arithmetical calculator and constructed a table of
partitions of integers up through 200. By studying this table, Ramanujan was able to


39 MacMahon (1978) vol. 1, pp. 508-563.
discover the arithmetical properties of the partition function. MacMahon’s calculations
played a crucial role in Ramanujan’s research, influential even today. 


25.13 Exercises 


(1) Prove Bernoulli’s formulas (25.17) and (25.18) for the probabilities to win. 
(2) Prove Euler’s first identity (25.19). 


(3) For $(m,\mu)$ defined by (25.33), prove Gauss’s formulas (25.34), (25.35), and
(25.36). 


(4) Following F. H. Jackson, set [a] = a [n]! = [1][2]---[n], and 


Ca) = 1—q*x 
(l=) =? Toqktay ) 


k=1 
Show that 
ce 1, el, (ollat+ 1) 5 | (alle +o +21 5 
ae Naga — pans ak BI oy 
See Jackson (1910). 
(5) Let u/(x) = Au(x) = He) and A~!u(x) = f u(x)dgx. Show that 


(a) f(x)! (x)dgx = u(x)v(x) — fu(gx)v' (x)dgx. 
(b) @) Ad —x)"*) = [n+ 0 — qx)™. 
(ii) 


/ Age) age = =x)?" 


x 
In+1] 


an jee — gx) aa x. 


eae 1 1 = 
(ill) fy x1 gx) Pdgx = Beery fo 2 1 = gO dyx 


SN a yy: 
Dg(m +n + 2) 


(iv) fy 8-1 = gt) BDL = gtx) *dgt 


(-—q%)a- ) 
= B,(f, 1-4 
al 6» @—@d—q 


(d—q*)(l—q%*)d — g®)(l— qt!) 5 
(1 — qg)(1 — q?)(1 — g”) 1 — gvt!) 
() ile ala eee
(l—q)Q — 4”) 
Seat aa Cag ea. 
d= —4q) 4”) — 47th) 
za By(B,y — & — B) 
Bg (B,y — B) 
(d) es aa dat = a, provided / + m is an integer. See Jackson 
(1910). 


(6) Prove Cauchy’s formula

$$\frac{(ax;q)_\infty}{(bx;q)_\infty} = \sum_{n=0}^{\infty}\frac{(b-a)(bq-a)\cdots(bq^{n-1}-a)}{(q;q)_n\,(bx;q)_n}\,q^{\frac{n(n-1)}{2}}x^n.$$


See Cauchy (1882-1974) vol. 8, series 1, pp. 42-50. 
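(7) Prove Ramanujan’s quintuple product identity

$$H(x) = \prod_{n=1}^{\infty}(1-q^n)(1-q^nx)\Bigl(1-\frac{q^{n-1}}{x}\Bigr)(1-q^{2n-1}x^2)\Bigl(1-\frac{q^{2n-1}}{x^2}\Bigr)$$
$$= \sum_{n=-\infty}^{\infty} q^{\frac{n(3n+1)}{2}}\bigl(x^{3n} - x^{-3n-1}\bigr).$$

One method of proof is to assume $H(x) = \sum_{n=-\infty}^{\infty} c(n)x^n$. Then compute

$$\frac{H(qx)}{H(x)} \quad \text{and} \quad \frac{H(1/x)}{H(x)}.$$

This formula has been discovered several times. It is possible that Weierstrass was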


aware of it, since it follows from a three-term relation for sigma functions, a 
part of elliptic functions theory, presented by Weierstrass in his lectures. This 
formula appears explicitly in a 1916 book on elliptic functions by R. Fricke. 
Again, Ramanujan found it around that same time and made extensive use of it. 
In this exercise, we name the formula after Ramanujan. For a detailed history 
of the formula and several proofs, see Cooper (2006). Also see the remarks in 
Berndt (1985-1998) Part III, p. 83. 
(8) Prove the septuple product identity of Farkas and Kra: 


(+x) —xT[a-4¢"’a- g'n(1 = ale = q"x i - t) 


n=1 


-Sc 1)"q 5n2 te (3 1)"q 5n2 Sn aD DG 1)"q Pi) 
yeas Ga (Se 1)"q 5n2 oni Ht 1)"q gaa) 
This result generalizes the quintuple product identity. For a proof, see Farkas
and Kra (2001) p. 271. 


25.14 Notes on the Literature 


A history of the quadratic Gauss sum is presented in Patterson (2007). For more papers 
on related topics, see Goldstein, Schappacher, and Schwermer (2007). Altmann and 
Ortiz (2005) gives interesting information about Rodrigues. For recent developments 
and proofs connected with the triple product identity, see Andrews (1986b) pp. 63-64, 
Foata and Han (2001) and Wilf (2001).
Partitions


26.1 Preliminary Remarks 


Naudé’s problems posed to Euler marked the beginning of research in the theory 
of partitions, as discussed in Chapter 25. Euler then solved these problems using 
generating functions, employing the same idea to prove the following remarkable 
theorem:¹


The number of different ways a given number can be expressed as the sum of different whole 
numbers is the same as the number of ways in which that same number can be expressed as the 
sum of odd numbers, whether the same or different. 


For example, the number of ways 6 can be expressed as a sum of different whole 
numbers is four: 


To prove this in general, Euler gave the generating function for the number of 
partitions with distinct parts: 


\[ (1+q)(1+q^2)(1+q^3)(1+q^4)(1+q^5)(1+q^6)\cdots. \]


Observe that 4 is the coefficient of $q^6$ in the power series expansion of this product,
for $q^6$ can be obtained as $q^6$, $q^5 q$, $q^4 q^2$, and $q^3 q^2 q$. On the other hand, Euler noted
that the generating function for odd parts was

\[ \frac{1}{(1-q)(1-q^3)(1-q^5)\cdots} = (1+q+q^2+q^3+\cdots)(1+q^3+q^6+\cdots)(1+q^5+q^{10}+\cdots)\cdots. \]


1 Euler (1988) pp. 275-276.
To prove the theorem, Euler showed that the generating functions and therefore the
coefficients of their series expansions were identical:

\[ (1+q)(1+q^2)(1+q^3)(1+q^4)\cdots = \frac{1-q^2}{1-q}\cdot\frac{1-q^4}{1-q^2}\cdot\frac{1-q^6}{1-q^3}\cdot\frac{1-q^8}{1-q^4}\cdots = \frac{1}{1-q}\cdot\frac{1}{1-q^3}\cdot\frac{1}{1-q^5}\cdots. \tag{26.1} \]
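Euler's theorem can also be checked numerically for small n. The following is a minimal sketch, not Euler's own method: it expands the two generating functions as truncated power series and compares coefficients. The function names are our own and purely illustrative.

```python
def count_partitions(n, allowed, distinct=False):
    """Coefficient of q^n in the generating function built from `allowed` parts.

    With distinct=True each part may be used at most once (factors 1 + q^k);
    otherwise parts may repeat (factors 1 / (1 - q^k)).
    """
    ways = [1] + [0] * n
    for part in allowed:
        if distinct:
            # Multiply by (1 + q^part); go downward so each part is used once.
            for total in range(n, part - 1, -1):
                ways[total] += ways[total - part]
        else:
            # Multiply by 1 / (1 - q^part); go upward so parts may repeat.
            for total in range(part, n + 1):
                ways[total] += ways[total - part]
    return ways[n]


def distinct_parts(n):
    """Number of partitions of n into different whole numbers."""
    return count_partitions(n, range(1, n + 1), distinct=True)


def odd_parts(n):
    """Number of partitions of n into odd numbers, repeated or not."""
    return count_partitions(n, range(1, n + 1, 2))
```

For n = 6 both functions return 4, in agreement with the example above.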


We also noted in Chapter 25 that in 1741 Euler conjectured and eight years later 
proved the pentagonal number theorem 


\[ \prod_{n=1}^{\infty}(1-q^n) = \sum_{n=-\infty}^{\infty}(-1)^n q^{n(3n+1)/2}. \tag{26.2} \]


Though Euler did not give a combinatorial interpretation of this identity, A. M. 
Legendre found one and included it in the 1830 edition of his number theory book.²
To understand Legendre’s interpretation, consider how $q^6$ would arise in the power
series expansion of the infinite product:

\[ (-q)(-q^2)(-q^3) = -q^6, \quad (-q)(-q^5) = +q^6, \quad (-q^2)(-q^4) = +q^6, \quad (-q^6) = -q^6. \]


When the partition of 6 contains an odd number of parts (e.g., 6 = 1 + 2 + 3)
then a corresponding −1 is contributed to the coefficient of $q^6$ in the series. When
the number of parts is even, then +1 is contributed. Hence the coefficient of $q^6$ in the
series is 0. Thus, if we denote by $p_e(n)$, $p_o(n)$ the number of partitions of n with an
even/odd number of distinct parts, then Legendre’s theorem states that

\[ p_e(n) - p_o(n) = \begin{cases} (-1)^m, & \text{when } n = \dfrac{m(3m \pm 1)}{2}, \\[1ex] 0, & \text{when } n \ne \dfrac{m(3m \pm 1)}{2}. \end{cases} \]
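Legendre's interpretation lends itself to a quick numerical check. The sketch below, with function names of our own choosing, expands the product of the factors (1 − q^k) up to a chosen degree, so that the coefficient of q^n is the difference between the number of partitions of n into an even and into an odd number of distinct parts. It then compares the result with the value predicted by the pentagonal numbers.

```python
def even_minus_odd(n_max):
    """Coefficients of prod_{k>=1} (1 - q^k) up to degree n_max.

    Entry n equals p_e(n) - p_o(n): each distinct part k contributes a
    factor -q^k, so a partition with j parts contributes (-1)^j.
    """
    coeff = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        # Multiply the truncated series by (1 - q^k).
        for total in range(n_max, k - 1, -1):
            coeff[total] -= coeff[total - k]
    return coeff


def legendre_prediction(n):
    """(-1)^m if n = m(3m - 1)/2 or m(3m + 1)/2 for some m >= 0, else 0."""
    if n == 0:
        return 1
    m = 1
    while m * (3 * m - 1) // 2 <= n:
        if n in (m * (3 * m - 1) // 2, m * (3 * m + 1) // 2):
            return (-1) ** m
        m += 1
    return 0
```

For example, the coefficient at n = 6 is 0, as in the computation above, while at n = 5 it is +1, since 5 = 2(3·2 − 1)/2.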


Before Euler, Leibniz conceived of the problem of partitioning a positive integer. 
In a brief letter of July 1699 to Johann Bernoulli,³ Leibniz enquired whether he had
considered the difficult problem of partitioning a given number into two, three, or 
more parts. According to Mahnke, in a manuscript dated September 1674, Leibniz 
gave examples of partitions: 3 into two parts as 2+ 1 and into three parts as 1+ 1+ 1; 
4 into two parts in two ways, 2 + 2 and 3 + 1, and into three parts in only one way, 
2+1+ 1. And in another manuscript, Leibniz pointed out the connection of partitions 
with symmetric functions: For example, the three partitions of 3, namely 3, 2 + 1, 1 +
1 + 1, correspond to the symmetric functions $\sum a^3$, $\sum a^2 b$, $\sum abc$.⁴

After Euler, J. J. Sylvester (1814-1897) was the next mathematician to make 
major contributions to the theory of partitions. Sylvester entered St. John’s College, 


2 Legendre (1830) pp. 131-133. 
3 Leibniz and Bernoulli (1745) pp. 461-462 or Leibniz (1971) vol. 3, part 2, p. 601. 
4 Mahnke (1912-13) p. 37. Also see Knuth (2011) pp. 505-506. 
Cambridge, in 1833 and came out as Second Wrangler in 1837. The great applied
mathematician George Green was fourth. Sylvester, of Jewish heritage, was unwilling 
to sign the thirty-nine articles; consequently, he was unable to take a degree, to 
obtain a fellowship, or to compete for one of the Smith’s prizes. It was only in 1855 
that he received a professorship of mathematics at the Royal Military Academy at 
Woolwich. Unfortunately, in 1870 he was retired early from this position, when his 
mathematical creativity was at its peak. In 1875, when Johns Hopkins University 
was founded in Baltimore, Sylvester was elected the first professor of mathematics 
(1876-83). Sylvester enjoyed a happy and productive late career in Baltimore; he there 
founded the American Journal of Mathematics whose first volume appeared in 1878. 
Moreover, Sylvester very successfully trained a number of excellent mathematicians, 
inaugurating serious mathematical research in America. It is not surprising that 
many of these American mathematicians contributed to the theory of partitions, since 
research in that topic required abundant ingenuity but more limited background. 

Sylvester’s interest in partitions arose fairly early. In 1853, he published a paper on 
his friend Cayley’s quick method for determining the degree of a symmetric function 
expressed as a polynomial in elementary symmetric functions. For that purpose, 
Cayley had employed a result Sylvester attributed to Euler:⁵ “To wit, that the number
of ways of breaking up a number n into parts is the same, whether we impose the 
condition that the number of parts in any partitionment shall not exceed m, or that the 
magnitude of any one of the parts should not exceed m.” To understand this last result, 
consider that the generating function for the number of partitions of an integer into at 
most m parts, with each part ≤ n, can be inductively demonstrated to be equal to the
Gaussian polynomial

\[ \frac{(1-q^{m+n})(1-q^{m+n-1})\cdots(1-q^{m+1})}{(1-q^n)(1-q^{n-1})\cdots(1-q)}. \]


This polynomial remains unchanged when m and n are interchanged; hence follows 
the result used by Sylvester. The Gaussian polynomial also cropped up in the work of 
Cayley and Sylvester in invariant theory. As we see in Chapter 30, they related the 
coefficients of the polynomial to the number of independent seminvariants. Cayley 
and Sylvester took an interest in partitions as a result of their researches on invariants. 
Though they both contributed to partition theory, Sylvester made the subject his own 
domain by establishing fundamental ideas and producing new researchers, in the form 
of his students. 
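The statement about the Gaussian polynomial can be tested on small cases. The sketch below is only an illustration under our own naming: `bounded` counts partitions directly by a simple recursion, while `gaussian_polynomial` expands the quotient of products as a truncated power series. Since each partial quotient is itself a polynomial of degree at most mn, truncating at that degree loses nothing.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def bounded(total, max_parts, max_size):
    """Partitions of `total` into at most `max_parts` parts, each <= `max_size`."""
    if total == 0:
        return 1
    if max_parts == 0 or max_size == 0:
        return 0
    # Either no part equals max_size, or we remove one part of that size.
    count = bounded(total, max_parts, max_size - 1)
    if total >= max_size:
        count += bounded(total - max_size, max_parts - 1, max_size)
    return count


def gaussian_polynomial(m, n):
    """Coefficients of prod_{i=1..n} (1 - q^(m+i)) / (1 - q^i), degrees 0..m*n."""
    degree = m * n
    coeff = [1] + [0] * degree
    for i in range(1, n + 1):
        shift = m + i
        for t in range(degree, shift - 1, -1):  # multiply by (1 - q^(m+i))
            coeff[t] -= coeff[t - shift]
        for t in range(i, degree + 1):          # divide by (1 - q^i)
            coeff[t] += coeff[t - i]
    return coeff
```

For m = n = 2 the coefficients are 1, 1, 2, 1, 1, and interchanging m and n leaves the list unchanged.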

A graphical proof of Euler’s theorem would start out by representing a given 
partition as a graph. For example, write the partition 5 + 2 + 1 of eight as

• • • • •
• •
•


5 Sylvester (1853b) p. 200. 
and then enumerate by columns. Thus, one obtains the conjugate partition 3 + 2 +
1+ 1+ 1 of eight. It is immediately clear that if we have a partition of an integer 
N into n parts of which the largest is m, then its conjugate is a partition of N into 
m parts of which the largest is n. This at once gives us the theorem: The number of 
partitions of any integer N into exactly n parts with the largest part m and the number 
of partitions of N into at most n parts with the largest part at most m both remain the 
same when m and n are interchanged. The proof of this theorem using the generating 
function method is less illuminating, illustrating the power of the graphical method. 
Sylvester remarked that he learned the technique from its originator, N. M. Ferrers. 
In a footnote to his paper, Sylvester wrote, “I learn from Mr Ferrers that this theorem 
was brought under his cognizance through a Cambridge examination paper set by Mr 
Adams of Neptune notability.”⁶ Here Sylvester was referring to the astronomer John
Couch Adams, discoverer of Neptune. 
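Reading a graph by columns is easy to mechanize. The following minimal sketch, in which the function name `conjugate` is our own, represents a partition as a weakly decreasing list of parts and counts, for each column, the rows long enough to reach it.

```python
def conjugate(partition):
    """Conjugate of a partition given as a weakly decreasing list of parts.

    Column c (counting from 0) of the graph has one node for every part
    larger than c, so the conjugate lists those column heights.
    """
    if not partition:
        return []
    return [sum(1 for part in partition if part > col)
            for col in range(partition[0])]
```

Applied to [5, 2, 1] this gives [3, 2, 1, 1, 1]; the number of parts and the largest part change places, and conjugating twice restores the original partition.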

It was within this very concrete graphical method that Sylvester and his American 
students, including Fabian Franklin, William Durfee, and Arthur Hathaway, made 
their original and important contributions to the theory of partitions. It is interesting to 
note that the other significant results obtained by American mathematicians at around 
the same time were in abstract algebra. At that time, this too was a topic requiring 
a minimal amount of background knowledge, unlike subjects such as the theory 
of abelian functions. Early American results in abstract algebra included Benjamin 
Peirce’s (1809-1880) paper on linear associative algebras dating from 1869, published 
posthumously in 1881 by his son Charles Sanders Peirce (1839-1914) in Sylvester’s
new journal.⁷ B. Peirce introduced the important concepts of nilpotent and idempotent
elements and the paper starts with his famous dictum “Mathematics is the science 
which draws necessary conclusions.” C. S. Peirce added an appendix to the paper, 
proving a significant theorem of his own on finite dimensional algebras over the real 
numbers. In modern language, the theorem states: The only division algebras algebraic 
over the real numbers are the fields of real and complex numbers and the division ring 
of quaternions.

The German mathematician G. Frobenius (1849-1917) also discovered this theorem
at about the same time as Peirce, though he published it in 1877.⁸ The Frobenius-Peirce
theorem and Franklin’s beautiful proof of Euler’s pentagonal number theorem
are the earliest major contributions by Americans to mathematics. We shall see 
details of Franklin’s work later in this chapter; concerning C. S. Peirce, we simply 
note that he made outstanding contributions to mathematical logic and to some 
aspects of philosophy. The systematic philosopher Justus Buchler, who edited Peirce’s 
philosophical writings, stated in the introduction, “Even to the most unsympathetic, 
Peirce’s thought cannot fail to convey something of lasting value. It has a peculiar 
property, like that of the Lernean hydra: discover a weak point, and two strong ones 
spring up beside it. Despite the elaborate architectonic planning of its creator, it is 
everywhere uncompleted, often distressingly so. There are many who have small 


6 ibid. p. 201.
7 Peirce (1881). 
8 Frobenius (1878). 
regard for things uncompleted, and no doubt what they value is much to be valued.
In his quest for magnificent array, in his design for a mighty temple that should house
his ideas, Peirce failed. He succeeded only in advancing philosophy.”⁹

After the researches of Sylvester and his young American students, P. A. 
MacMahon (1854-1929) dominated the topic of partitions. One of MacMahon’s 
results was connected with Ramanujan’s 1910 rediscovery of two identities, first 
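found by Rogers in the 1890s¹⁰ during his work on q-series:

\[ \sum_{m=0}^{\infty} \frac{q^{m^2}}{(q;q)_m} = \prod_{m=0}^{\infty} (1-q^{5m+1})^{-1}(1-q^{5m+4})^{-1}, \tag{26.3} \]

\[ \sum_{m=0}^{\infty} \frac{q^{m(m+1)}}{(q;q)_m} = \prod_{m=0}^{\infty} (1-q^{5m+2})^{-1}(1-q^{5m+3})^{-1}. \tag{26.4} \]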
In 1913, Srinivasa Ramanujan communicated these identities to G. H. Hardy,
although by this time Rogers’s work was forgotten. Ramanujan had no proof; Hardy 
unsuccessfully sought a proof, showing the identities to his colleagues. MacMahon 
was among those who saw the formulas. An expert in symmetric functions, invariant 
theory, partitions, and combinatorics, he had known Sylvester and his work. Thus, it 
was natural that MacMahon conceived of an interpretation of the identities in terms 
of partitions. By expanding $(1 - q^{5m+1})^{-1}$ and $(1 - q^{5m+4})^{-1}$ as geometric series,
the coefficient of $q^n$ in the expression on the right-hand side of the first identity is
clearly equal to the number of partitions of n into parts ≡ 1 or 4 (mod 5). For the
left-hand side, observe that

\[ m^2 = (2m-1) + (2m-3) + \cdots + 5 + 3 + 1, \]

or the sum of the first m odd numbers. We can find a partition of n if $n - m^2$ is partitioned
into at most m parts with the largest part added to 2m − 1, the next to 2m − 3 and so
on. The parts in this partition of n differ by at least 2. Moreover, the partitions of n
associated with a specific m are enumerated by

\[ \frac{q^{m^2}}{(1-q)(1-q^2)\cdots(1-q^m)}, \]


and the sum of these terms yields all the partitions of this form. We therefore have 
MacMahon’s theorem, presented in his 1915 Combinatory Analysis:¹¹ The number of
partitions of n in which the difference between any two parts is at least 2, equals the
number of partitions of n into parts ≡ 1 or 4 (mod 5). We note that in MacMahon’s own
statement of the theorem, instead of specifying that the parts differ by at least 2, he
wrote that there were neither repetitions nor sequences. In a similar way, the second
identity states: The number of partitions of n in which the least part is ≥ 2 and the


9 Buchler (1955) p. xvi.
10 Rogers (1894) § 5.
11 MacMahon (1915-16) vol. 2, pp. 32-36.
difference between any two parts is at least 2, is equal to the number of partitions
of n into parts ≡ 2 or 3 (mod 5). This arises out of the relation m(m + 1) = 2 + 4 +
6 + ⋯ + 2m.
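Both partition-theoretic statements can be verified by direct counting for small n. The sketch below is a brute-force illustration, not MacMahon's argument, and its function names are our own. One function builds partitions whose parts increase by at least a given gap; the other counts partitions whose parts lie in prescribed residue classes.

```python
def count_gap_partitions(n, gap, smallest):
    """Partitions of n into parts >= smallest, any two differing by >= gap."""
    def extend(remaining, min_next):
        # Parts are chosen in increasing order, each at least `gap` above the last.
        if remaining == 0:
            return 1
        return sum(extend(remaining - part, part + gap)
                   for part in range(min_next, remaining + 1))
    return extend(n, smallest)


def count_residue_partitions(n, residues, modulus):
    """Partitions of n into parts congruent to one of `residues` mod `modulus`."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        if part % modulus in residues:
            for total in range(part, n + 1):
                ways[total] += ways[total - part]
    return ways[n]
```

For n = 9, for example, there are five partitions with parts differing by at least 2 (9, 8 + 1, 7 + 2, 6 + 3, 5 + 3 + 1) and five into parts ≡ 1 or 4 (mod 5).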

Several proofs of the Rogers—Ramanujan identities have been given and they have 
been generalized both combinatorially and analytically. Issai Schur independently 
discovered the Rogers-Ramanujan identities and their partition theoretic interpretation;
in 1917 he gave two proofs, one of which was combinatorial.¹² However, as
Hardy wrote in 1940, it is only natural to seek an argument that sets up a one-
to-one correspondence between the two sets of partitions. No such bijective proof
was known in Hardy’s time, and it was not until 1981 that Adriano Garsia and
Stephen Milne, working on the foundation established by Schur, published a bijective proof of
the MacMahon-Schur theorem, equivalent to the Rogers-Ramanujan identities.¹³ We
note that Schur’s combinatorial proof also motivated Basil Gordon’s 1961
partition-theoretic generalization.¹⁴ See the exercises.

Issai Schur (1875-1941) was born in Russia but studied at the University of Berlin 
under Georg Frobenius who had a great influence on him. Schur made fundamental 
contributions to representation theory, to the related theory of symmetric functions, 
and also to topics in analysis such as the theory of commutative differential operators. 
A great teacher, he founded an outstanding school of algebra in Berlin. Dismissed 
from his chair by the Nazi government, he took a position in 1938 at the Hebrew 
University in Jerusalem. 

Garsia and Milne’s bijective proof of the Rogers-Ramanujan identities is based
on their involution principle: Let $C = C^+ \cup C^-$, where $C^+ \cap C^- = \emptyset$, be the
disjoint union of two finite components $C^+$ and $C^-$. Let α and β be two involutions
on C, each of whose fixed points lie in $C^+$. Let $F_\alpha$ (resp. $F_\beta$) denote the fixed-point
set of α (resp. β). Suppose $\alpha(C^+ - F_\alpha) \subset C^-$ and $\alpha(C^-) \subset C^+$ and similarly
$\beta(C^+ - F_\beta) \subset C^-$ and $\beta(C^-) \subset C^+$. Then a cycle of the permutation Δ = αβ
contains either fixed points of neither α nor β, or exactly one element of $F_\alpha$ and one
of $F_\beta$. This powerful involution principle has been successfully applied to several
q-series identities. Garsia and Milne’s proof of Rogers-Ramanujan was very long but
soon afterward David Bressoud and Doron Zeilberger found a shorter proof.¹⁵

Now observe that in Euler’s theorem the parts are distinct and hence differ by at 
least one, whereas in MacMahon’s theorem the parts differ by at least two. If we 
denote by $q_{d,m}(n)$ the number of partitions of n into parts differing by at least d, each
part being greater than or equal to m, the Euler and MacMahon theorems take the form

\[ q_{d,m}(n) = p_{d,m}(n), \]

where $p_{d,m}(n)$ is the number of partitions of n into parts taken from a fixed set $S_{d,m}$.
H. L. Alder observed that for d = 1, m could be taken to be any positive integer. In
fact, the number of partitions of n into distinct parts, with each part ≥ m, was equal to

12 Schur (1917).

13 Garsia and Milne (1981).

14 Gordon (1961). 

15 Bressoud and Zeilberger (1982). 
the number of partitions of n into parts taken from the set {m, m + 1, ..., 2m − 1, 2m +
1, 2m + 3, ...}.
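Alder's observation for d = 1 can be checked by comparing two counts. This is a small sketch with names of our own; the set of allowed parts follows the description above, namely every integer from m to 2m − 1 together with the odd integers from 2m + 1 onward.

```python
def distinct_parts_at_least(n, m):
    """Partitions of n into distinct parts, each >= m."""
    ways = [1] + [0] * n
    for part in range(m, n + 1):
        # Downward loop: each part is used at most once.
        for total in range(n, part - 1, -1):
            ways[total] += ways[total - part]
    return ways[n]


def alder_set_partitions(n, m):
    """Partitions of n into parts from {m, ..., 2m-1, 2m+1, 2m+3, ...}."""
    allowed = list(range(m, 2 * m)) + list(range(2 * m + 1, n + 1, 2))
    ways = [1] + [0] * n
    for part in allowed:
        # Upward loop: parts may repeat; parts larger than n are skipped.
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]
```

With m = 1 the allowed set is the set of odd numbers, and the comparison reduces to Euler's theorem.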

In 1946 D. H. Lehmer proved for m = 1, and in 1948 Alder proved for the general 
case: The number $q_{d,m}(n)$ is not equal to the number of partitions of n into parts taken
from any set of integers whatsoever unless d = 1 or d = 2, m = 1, 2. Now the
generating function for $q_{d,m}(n)$ is easily seen to be

\[ \sum_{k=0}^{\infty} \frac{q^{mk + dk(k-1)/2}}{(1-q)(1-q^2)\cdots(1-q^k)}, \tag{26.5} \]

while the generating function for partitions with parts from a fixed set $\{a_1, a_2, a_3, \ldots\}$ is

\[ \frac{1}{\prod_{k=1}^{\infty}(1-q^{a_k})}. \]

Alder’s proof consisted in showing that no matter how the $a_k$ were chosen, the two
generating functions could not be equal for the values of m and d excluded by the
theorem.¹⁶

When MacMahon interpreted the Rogers-Ramanujan identity in terms of par- 
titions, Hardy and Ramanujan may have been spurred to examine the asymptotic 
behavior of p(n), the number of partitions of n. MacMahon assisted them in this work
by constructing a table of p(n) for n = 1, 2, ..., 200. We later consider the impact of
this on the work of Hardy and Ramanujan. For now, we note that this table was created
by means of Euler’s formula

\[ p(n) = p(n-1) + p(n-2) - p(n-5) - p(n-7) + \cdots + (-1)^{m-1} p\!\left(n - \frac{m(3m-1)}{2}\right) + (-1)^{m-1} p\!\left(n - \frac{m(3m+1)}{2}\right) + \cdots. \tag{26.6} \]


Note that p(0) = 1 and p(k) = 0 for k negative. This formula is quite efficient for numerical
work. Ramanujan enjoyed numerical computation and could do it with unusual rapid- 
ity and accuracy. It is therefore interesting that in his obituary notice of Ramanujan, 
Hardy wrote, “There is a table of partitions at the end of our paper .... This was, 
for the most part, calculated independently by Ramanujan and Major MacMahon; 
and Major MacMahon was, in general, slightly the quicker and more accurate of 
the two.”¹⁷
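To see how little work formula (26.6) requires, here is a minimal sketch of a table builder in the spirit of MacMahon's computation. The function name and layout are our own, and we take p(0) = 1 and p(k) = 0 for negative k as starting conventions.

```python
def partition_table(limit):
    """Return [p(0), p(1), ..., p(limit)] using Euler's recurrence (26.6)."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = 0
        m = 1
        # Terms with a negative argument vanish, so stop once m(3m-1)/2 > n.
        while m * (3 * m - 1) // 2 <= n:
            sign = 1 if m % 2 == 1 else -1          # this is (-1)^(m-1)
            total += sign * p[n - m * (3 * m - 1) // 2]
            second = m * (3 * m + 1) // 2
            if second <= n:
                total += sign * p[n - second]
            m += 1
        p[n] = total
    return p
```

Each value needs only on the order of √n earlier entries. The last entry of a table to 200, as in MacMahon's, is p(200) = 3972999029388.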

J. E. Littlewood once remarked that every positive integer was one of Ramanujan’s 
personal friends.¹⁸ Thus, Ramanujan noticed in the tables something missed by others,
the arithmetical properties of partitions. In his 1919 paper on partitions he wrote,¹⁹


16 For this history and for good references, see Alder (1969). 
17 Ramanujan (2000) p. xxxv. 

18 Littlewood (1986) p. 61. 

19 Ramanujan (1919b). 
On studying the numbers in this table I observed a number of curious congruence properties,
apparently satisfied by p(n). Thus

(1)   p(4),    p(9),    p(14),   p(19),   ...  ≡ 0  (mod 5),
(2)   p(5),    p(12),   p(19),   p(26),   ...  ≡ 0  (mod 7),
(3)   p(6),    p(17),   p(28),   p(39),   ...  ≡ 0  (mod 11),
(4)   p(24),   p(49),   p(74),   p(99),   ...  ≡ 0  (mod 25),
(5)   p(19),   p(54),   p(89),   p(124),  ...  ≡ 0  (mod 35),
(6)   p(47),   p(96),   p(145),  p(194),  ...  ≡ 0  (mod 49),
(7)   p(39),   p(94),   p(149),  ...           ≡ 0  (mod 55),
(8)   p(61),   p(138),  ...                    ≡ 0  (mod 77),
(9)   p(116),  ...                             ≡ 0  (mod 121),
(10)  p(99),   ...                             ≡ 0  (mod 125).

From these data I conjectured the truth of the following theorem: If $\delta = 5^a 7^b 11^c$ and 24λ ≡ 1
(mod δ) then

\[ p(\lambda),\ p(\lambda+\delta),\ p(\lambda+2\delta),\ \ldots \equiv 0 \pmod{\delta}. \]


Ramanujan gave very simple proofs of p(5m + 4) ≡ 0 (mod 5) and p(7m + 5) ≡ 0
(mod 7), using only Euler’s pentagonal number theorem and Jacobi’s formula for
$\prod_{n=1}^{\infty}(1 - q^n)^3$. Ramanujan’s further efforts, to prove p(25m + 24) ≡ 0 (mod 25)
and p(49m + 47) ≡ 0 (mod 49), led him deeper into the theory of modular functions.
In particular, he found the following two remarkable identities:

\[ p(4) + p(9)q + p(14)q^2 + \cdots = 5\,\frac{\{(1-q^5)(1-q^{10})(1-q^{15})\cdots\}^5}{\{(1-q)(1-q^2)(1-q^3)\cdots\}^6}, \tag{26.7} \]

\[ p(5) + p(12)q + p(19)q^2 + \cdots = 7\,\frac{\{(1-q^7)(1-q^{14})(1-q^{21})\cdots\}^3}{\{(1-q)(1-q^2)(1-q^3)\cdots\}^4} + 49q\,\frac{\{(1-q^7)(1-q^{14})(1-q^{21})\cdots\}^7}{\{(1-q)(1-q^2)(1-q^3)\cdots\}^8}. \tag{26.8} \]

The rest of Ramanujan’s conjecture concerning the divisibility of the partition
function by $5^a 7^b 11^c$ is not completely correct. In 1934, on the basis of the extended
tables for p(n) constructed by Hansraj Gupta, Sarvadaman Chowla observed²⁰ that
p(243) was not divisible by $7^3$, though 24 · 243 ≡ 1 (mod $7^3$). However, p(243) is
divisible by $7^2$. The correct reformulation of Ramanujan’s conjecture would state: Let
$\delta = 5^a 7^b 11^c$, $\delta' = 5^a 7^{b'} 11^c$, where b′ = b, if b = 0, 1, 2, and b′ = ⌊(b + 2)/2⌋, if
b > 2. If 24λ ≡ 1 (mod δ), then

\[ p(\lambda + n\delta) \equiv 0 \pmod{\delta'}, \quad n = 0, 1, 2, \ldots. \tag{26.9} \]
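Chowla's observation is easy to reproduce today. The sketch below, whose function name is our own, builds p(n) by expanding the product of the factors 1/(1 − q^k) as a truncated power series. This is a different route from Euler's recurrence, chosen here only for brevity. It then tests the divisibility of p(243) by powers of 7.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] from the product of 1 / (1 - q^k)."""
    ways = [1] + [0] * limit
    for part in range(1, limit + 1):
        # Multiply the truncated series by 1 / (1 - q^part).
        for total in range(part, limit + 1):
            ways[total] += ways[total - part]
    return ways


p = partition_numbers(250)

# 24 * 243 = 5832 = 17 * 343 + 1, so the hypothesis 24*lambda = 1 (mod 7^3) holds.
hypothesis_holds = (24 * 243) % 7**3 == 1
divisible_by_343 = p[243] % 7**3 == 0   # fails, as Chowla observed
divisible_by_49 = p[243] % 7**2 == 0    # holds, as the corrected statement predicts
```

Here b = 3 gives b′ = ⌊5/2⌋ = 2, so (26.9) predicts divisibility by 7² only. The same table confirms, for instance, that p(24) is divisible by 25 and p(47) by 49.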


In an unpublished manuscript, Ramanujan outlined a proof of his conjecture for 
arbitrary powers of 5. He may have had a proof for the powers of 7 as well, since 


20 Chowla (1934). 
he apparently began writing it down.²¹ George N. Watson’s proof²² of Ramanujan’s
conjecture for powers of 5 is identical with the one contained in the unpublished 
manuscript. Watson also gave a proof of the corrected version for the powers of 7. 
In 1967, A. O. L. Atkin provided a proof for powers of 11,²³ based on work of
Joseph Lehner from the 1940s. Atkin and Lehner’s proofs require the use of modular 
equations, a topic in which Ramanujan was a great expert. It is remarkable that he 
was able to conjecture an essentially correct result on so little numerical evidence, 
especially in higher powers. 

The fact that p(5n + 4) = 0 (mod 5) suggests that partitions of 5n + 4 should be 
divisible into five classes with the same number of partitions in each class. Freeman 
Dyson got this idea around 1940 when he was in high school; as a second year student 
at Cambridge University, he found a way of making this division.²⁴ For this purpose,
he defined the concept of the rank of a partition: the largest part minus the number 
of parts. He checked this concept, applying it to the three cases p(4), p(9), and p(14) 
and found it accurate; he also found that it worked for p(5) and p(12). He conjectured 
its truth for all p(5n +4) and for p(7n +5), but was unable to prove it. A decade later, 
Atkin and Peter Swinnerton-Dyer found a proof²⁵ involving combinatorial arguments
combined with ideas from modular function theory. Mock theta functions also made an 
appearance; Atkin and Swinnerton-Dyer rediscovered and used a number of identities 
for mock theta functions. Unbeknownst to them and the rest of the world, these
identities were contained in Ramanujan’s lost notebook, then buried under a mountain
of paper on the floor of Watson’s study and only later discovered by Andrews.

The rank of a partition can be defined graphically as the signed difference between 
the number of nodes in the first row and number of nodes in the first column. Consider 
the ranks of the partitions of 5: 


Partition              Rank

5                      5 − 1 ≡ 4 (mod 7)
4 + 1                  4 − 2 ≡ 2 (mod 7)
3 + 2                  3 − 2 ≡ 1 (mod 7)
3 + 1 + 1              3 − 3 ≡ 0 (mod 7)
2 + 2 + 1              2 − 3 ≡ 6 (mod 7)
2 + 1 + 1 + 1          2 − 4 ≡ 5 (mod 7)
1 + 1 + 1 + 1 + 1      1 − 5 ≡ 3 (mod 7).
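The table above can be reproduced, and Dyson's checks repeated, with a few lines of code. The following is a minimal sketch; the generator `partitions` and the helper `rank_classes` are our own illustrative names, and the enumeration is brute force, so it is suitable only for small n.

```python
from collections import Counter


def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def rank(partition):
    """Dyson's rank: largest part minus number of parts."""
    return partition[0] - len(partition)


def rank_classes(n, modulus):
    """How many partitions of n have rank congruent to 0, 1, ..., modulus - 1."""
    counts = Counter(rank(p) % modulus for p in partitions(n))
    return [counts[r] for r in range(modulus)]
```

For the seven partitions of 5, each residue class modulo 7 occurs exactly once. For the eleven partitions of 6 the rank does not separate the classes modulo 11, since 4 + 1 + 1 and 3 + 3 both have rank 1.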


Dyson found that the concept of rank failed to classify the partitions of 11n + 6; 
he conjectured the existence of a crank for this purpose. Almost half a century later, 
a day after the 1987 Centenary Conference at the University of Illinois, celebrating 
the work of Ramanujan, Andrews and Frank G. Garvan discovered the crank:²⁶ The
crank of a partition is the largest part in the partition if it has no ones; otherwise, 


21 Ramanujan (1988) pp. 238-243. Also see Berndt and Ono (2001). 
22 Watson (1938).

23 Atkin (1967).

24 Dyson (1944).

25 Atkin and Swinnerton-Dyer (1954).

26 Andrews and Garvan (1988a). 
it is the number of parts greater than the number of ones, minus the number of ones.
A nice property of the crank is that it works for all three moduli 5, 7, and 11. Amazingly, Ramanujan
discovered the generating functions for both the rank and the crank, and his results
can again be found in his lost notebook,²⁷ though he did not use these names.
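The definition of the crank translates directly into code. The sketch below uses function names of our own and enumerates partitions by brute force. It classifies the partitions of a few small numbers of the forms 5n + 4, 7n + 5, and 11n + 6 by crank.

```python
from collections import Counter


def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def crank(partition):
    """Andrews-Garvan crank of a nonempty partition (weakly decreasing tuple)."""
    ones = partition.count(1)
    if ones == 0:
        return partition[0]  # no ones: the largest part
    # Otherwise: (number of parts exceeding the number of ones) - (number of ones).
    return sum(1 for part in partition if part > ones) - ones


def crank_classes(n, modulus):
    """How many partitions of n have crank congruent to 0, 1, ..., modulus - 1."""
    counts = Counter(crank(p) % modulus for p in partitions(n))
    return [counts[r] for r in range(modulus)]
```

The eleven partitions of 6 have cranks 6, 0, 4, −1, 3, 1, −3, 2, −2, −4, −6, which fall into eleven different classes modulo 11.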

Concerning the congruence properties of partitions, Ramanujan wrote, “It appears 
that there are no equally simple properties for any moduli involving primes other 
than these three.”²⁸ As we shall see, Ramanujan’s intuition has been shown to be
correct. However, in the late 1960s, Atkin found some more complicated congruences
involving other primes. For example, he showed that²⁹

\[ p(11^3 \cdot 13n + 237) \equiv 0 \pmod{13}, \]
\[ p(23^3 \cdot 17n + 2623) \equiv 0 \pmod{17}. \]


Atkin used computers to do the numerical work necessary for constructing these 
examples. In fact, Atkin was among the pioneers in the use of computers for number 
theory research. Concerning this aspect of his work, he wrote in his 1968 paper, “it 
is often more difficult to discover results in this subject than to prove them, and an 
informed search on the machine may enable one to find out precisely what happens.” 
Atkin’s aim was to understand partition identities, including Ramanujan’s, from the 
more general viewpoint of modular function theory. His student Margaret Ashworth 
(1944-73) shared this perspective, although her researches were halted much too 
soon. Thus, Atkin and Ashworth did not succeed in fully developing their approach. 
Atkin himself made important contributions to the theory of modular forms and in 
1970, Atkin and Lehner conceived the fundamental idea of new forms.³⁰ These are
eigenforms for Hecke operators, on the space of cusp forms for Hecke subgroups of 
the modular group. 

In fact, it was only recently that Ken Ono developed a theory of the kind Atkin may 
have been seeking. In 2000, Ono was able to prove that for any prime ℓ ≥ 5, there
exist infinitely many congruences of the form p(An + B) ≡ 0 (mod ℓ). Soon after
this, Scott Ahlgren extended the congruence to the case in which ℓ is replaced by $\ell^k$.
Subsequently, Ono and Ahlgren jointly extended these results and wrote a historical
essay explaining that their work “provides a theoretical framework which explains
every known partition function congruence.”³¹ Ono and Ahlgren based their work on
results in modular forms from the 1960s and 1970s due to Goro Shimura, Jean-Pierre 
Serre, and Pierre Deligne. 

Confirming another conjecture of Ramanujan, in 2003³² Ahlgren and Matthew
Boylan proved that if ℓ is prime and 0 ≤ β < ℓ is any integer for which

\[ p(\ell n + \beta) \equiv 0 \pmod{\ell} \quad \text{for all } n \ge 0, \]


27 Ramanujan (1988) pp. 179-182. 
28 Ramanujan (1919b). 

29 Atkin (1968). 

30 Atkin and Lehner (1970). 

31 Ahlgren and Ono (2001). 
then

\[ (\ell, \beta) \in \{(5,4),\ (7,5),\ (11,6)\}. \]


We note that all these cases of simple congruence were found by Ramanujan; his 
intuition that no other cases exist has been verified. In 2005, Karl Mahlburg succeeded 
in extending the partition congruences to the crank function.³³ Let M(m, N, n) be the
number of partitions of n whose crank is congruent to m (mod N). Mahlburg’s theorem states
that for every prime ℓ ≥ 5 and integer i ≥ 1, there are infinitely many nonnested
arithmetical progressions An + B such that simultaneously for every $0 \le m \le \ell^i - 1$

\[ M(m, \ell^i, An + B) \equiv 0 \pmod{\ell^i}. \]

It is clear from the definition of M that

\[ p(n) = M(0, N, n) + M(1, N, n) + \cdots + M(N-1, N, n). \]

Therefore, Mahlburg’s theorem implies the corresponding result for p(n).

MacMahon and Hardy greatly admired Ramanujan’s generating function for 
p(5n + 4). A number of proofs of this and the generating function for p(7n + 5) 
have subsequently been found. A recent proof by Hershel Farkas and Irwin Kra is 
based on the theory of Riemann surfaces and theta functions.³⁴

In the final year of his life, Ramanujan introduced a new type of series, mock 
theta functions. These q-series, convergent in |q| < 1, also have connections with the
theory of partitions, although Ramanujan’s motivation was to study their asymptotic
properties as q approached a root of unity. Ramanujan noted that the asymptotic
behavior of theta series such as 


ee) n2 
n=0
and 
q
pda 0a) 
could be expressed in a neat and closed exponential form as q approached roots of
unity. He conceived mock theta functions as those series with similar asymptotic 
properties, without being theta functions. He gave seventeen examples of mock theta 
functions, dividing them into four groups, named mock theta functions of orders 3, 5, 
5, and 7. One of the third-order functions he mentioned was defined by 
He noted that when $q = -e^{-t}$ and $t \to 0$
Ramanujan also stated a few identities connecting some of these functions with each
other. For example, he mentioned the third-order function 


\[ \chi(q) = 1 + \frac{q}{1-q+q^2} + \frac{q^4}{(1-q+q^2)(1-q^2+q^4)} + \cdots \]


and the relation 


\[ 4\chi(q) - f(q) = \frac{3\,(1 - 2q^3 + 2q^{12} - \cdots)^2}{(1-q)(1-q^2)(1-q^3)\cdots}. \]


After Ramanujan, G. N. Watson (1886-1965) was the first to study these functions. 
The title of his 1936 paper on this topic,³⁵ “The Final Problem: An Account of the
Mock Theta Functions,” was borrowed from an Arthur Conan Doyle story. In this
paper, Watson introduced three new third-order functions, and proved identities 
such as 


4x(Q -f@= 
Watson employed the identities to show that the third-order mock theta functions
had the asymptotic properties asserted by Ramanujan and that they were not theta 
functions. A year later, Watson proved that the fifth-order functions listed by Ramanu- 
jan had the asymptotic properties; he did not succeed in showing that they were not 
theta functions. Watson’s proofs of some of the identities were long, and he wrote 
that he counted the number of steps in the longest to be twenty-four instead of the 
thirty-nine he had hoped for as a student of John Buchan. 

Watson’s papers motivated Atle Selberg (1917-2007) to prove asymptotic formulas 
for seventh-order functions. Selberg had been drawn to a study of Ramanujan’s work 
by a 1934 article by Carl Størmer in a periodical of the Norwegian Mathematical
Society. The next year, Selberg started reading Ramanujan’s Collected Papers. In 
1987, he described his impressions:³⁶ “So I got a chance to browse through it for
several weeks. It seemed quite like a revelation — a completely new world to me, quite 
different from any mathematics book I had ever seen — with much more appeal to the 
imagination, I must say. And frankly, it still seems very exciting to me and also retains 
that air of mystery which I felt at the time. It was really what gave the impetus which 
started my own mathematical work. I began on my own, experimenting with what is 
often referred to as q-series and identities and playing around with them.” 


35 Watson (1936).
36 Selberg (1989) vol. 1, pp. 695-706, especially p. 696. 
In the 1960s, Andrews began his extensive work on mock theta functions. His work
was further facilitated by his dramatic 1976 discovery of Ramanujan’s Lost Notebook 
in the Trinity College library of Cambridge University. For example, among myriad 
formulas in this notebook, Ramanujan gave ten identities for the fifth-order functions. 
In 1987, Andrews and Garvan showed that these ten identities could be reduced to two 
conjectures on partitions.³⁷ To state these conjectures, let $R_a(n)$ denote the number of
partitions of n with rank congruent to a (mod 5). The first conjecture stated that for
every positive integer n, $R_1(5n) - R_0(5n)$ was equal to the number of partitions of
n with unique smallest part and all other parts less than or equal to the double of the
smallest part. The second stated that $2R_2(5n+3) - R_1(5n+3) - R_0(5n+3) - 1$ was
equal to the number of partitions of n with unique smallest part and all other parts less
than or equal to one plus the double of the smallest part. A year later Dean Hickerson
proved these conjectures.³⁸

We mention in passing that, as a byproduct of his work on mock theta
functions, Andrews discovered the identity³⁹


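\[ \left(\sum_{n=0}^{\infty} q^{n(n+1)/2}\right)^{3} = \sum_{n=0}^{\infty}\sum_{j=0}^{2n} q^{2n^2+2n-j(j+1)/2}\,\frac{1+q^{2n+1}}{1-q^{2n+1}}. \]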
An immediate consequence of this formula is that every positive integer can be
expressed as a sum of at most three triangular numbers. This theorem was first stated 
by Fermat, who said he had a proof. The first published proof appeared in Gauss’s 
Disquisitiones. 
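The theorem on triangular numbers is easily tested by machine for small integers. The following brute-force sketch, with a function name of our own, counts 0 as a triangular number so that sums of fewer than three positive triangular numbers are covered as well. It illustrates the statement only and does not use the identity above.

```python
def is_sum_of_three_triangular(n):
    """True if n = a + b + c with a, b, c triangular numbers (0 allowed)."""
    triangular = []
    k = 0
    while k * (k + 1) // 2 <= n:
        triangular.append(k * (k + 1) // 2)
        k += 1
    lookup = set(triangular)
    return any((n - a - b) in lookup
               for a in triangular
               for b in triangular
               if a + b <= n)
```

For instance, 5 = 3 + 1 + 1 and 20 = 10 + 10 + 0.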

Though mock theta functions were shown to have connections with several areas 
of mathematics, it was not clear how they fit into any known general framework. The 
work of Sander Zwegers, Don Zagier, Ken Ono, and Kathrin Bringmann, 2002-2007, 
has shown that Ramanujan’s twenty-two mock theta functions are examples of infinite
families of weak Maass forms of weight 1/2. This understanding has led to further new
results. 


26.2 Sylvester on Partitions 


In 1882, Sylvester collected together the investigations he and his students had done 
on partitions dating from 1877-1882 and published them in his newly founded journal 
as a long paper,⁴⁰ “A Constructive Theory of Partitions, Arranged in Three Acts,
an Interact and an Exodion.” He presented Franklin’s proof of Euler’s pentagonal 
number theorem (26.2). Sylvester placed the smallest part at the top of his graphical 
representation. We present the proof in his own words, illustrating his habit of using
periods very sparingly. 


If a regular graph represent a partition with unequal elements, the lines of magnitude must 
continually increase or decrease. Let the annexed figures be such graphs written in ascending 
order from above downward: 


[Figure: the graphs (A), (B), and (C)] 


In (A) and (B) the graphs may be transformed without altering their content or regularity by 
removing the nodes at the summit and substituting for them a new slope line at the base. In C 
the new slope line at the base may be removed and made to form a new summit; the graphs so 
transformed will be as follows: 


[Figure: the transformed graphs (A′), (B′), and (C′)] 


A’ and B’ may be said to be derived from A, B by a process of contraction, and C’ from C by 
one of protraction. 

Contraction could not now be applied to A’ and B’, nor protraction to C’ without destroying 
the regularity of the graph; but the inverse processes may of course be applied, namely, of 
protraction to A’ and B’ and contraction to C’, so as to bring back the original graph A, B,C. 

In general (but as will be seen not universally), it is obvious that when the number of nodes 
in the summit is inferior or equal to the number in the base-slope, contraction may be applied, 
and when superior to that number, protraction: each process alike will alter the number of parts 
from even to odd or from odd to even, so that barring the exceptional cases which remain to 
be considered where neither protraction nor contraction is feasible, there will be a one-to-one 
correspondence between the partitions of n into an odd number and the partitions of n into an 
even number of unrepeated parts; the exceptional cases are those shown below where the summit 
meets the base-slope line, and contains either the same number or one more than the number 
of nodes in that line; in which case neither protraction nor contraction will be possible, as seen 
in the annexed figures which are written in regular order of succession, but may be indefinitely 
continued: 

for the protraction process which ought, for example, according to the general rule, to be 
applicable to the last of the above graphs, cannot be applied to it, because on removing the nodes 
in the slope line and laying them on the summit, in the very act of so doing the summit undergoes 
the loss of a node and is thereby incapacitated to be surmounted by the nodes in the slope, which 
will have not now a less, but the same number of nodes as itself; and in like manner, in the last 
graph but one, the nodes in the summit cannot be removed and a slope line be added on containing 
the same number of nodes without the transformed graph ceasing to be regular, in fact it would 
take the form 


which, although regular, would cease to represent a partition into unlike numbers. The excepted 
cases then or unconjugate partitions are those where the number of parts being j, the successive 
parts form one or the other of the two arithmetical series 

$$j,\ j+1,\ j+2,\ \ldots,\ 2j-1 \quad\text{or}\quad j+1,\ j+2,\ \ldots,\ 2j,$$

in which cases the contents are $\frac{3j^2-j}{2}$ and $\frac{3j^2+j}{2}$ respectively, and consequently since in the 
product of $1-x\,.\,1-x^2\,.\,1-x^3\cdots$ the coefficient of $x^n$ is the number of ways of composing 
n with an even less the number of ways of composing it with an odd number of parts, the product 
will be completely represented by $\sum_{j=-\infty}^{\infty} (-1)^j x^{\frac{3j^2+j}{2}}$. 
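The contraction and protraction processes in the passage above can be sketched in code. This is a modern paraphrase, not Sylvester's own formulation: partitions are written in decreasing order, the "summit" is the smallest part, the "base-slope" is the run of consecutive integers starting from the largest part, and the names `franklin` and `distinct_partitions` are illustrative.

```python
def distinct_partitions(n, max_part=None):
    """Yield partitions of n into distinct parts as strictly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in distinct_partitions(n - first, first - 1):
            yield (first,) + rest


def franklin(parts):
    """Apply contraction or protraction; return None in the two exceptional cases."""
    lam = list(parts)
    k = len(lam)
    s = lam[-1]                      # nodes in the summit (smallest part)
    sigma = 1                        # nodes in the base-slope line
    while sigma < k and lam[sigma] == lam[sigma - 1] - 1:
        sigma += 1
    if s <= sigma:
        if s == sigma and sigma == k:
            return None              # summit meets the slope: j, j+1, ..., 2j-1
        lam.pop()                    # contraction: remove the summit ...
        for i in range(s):
            lam[i] += 1              # ... and lay it along a new slope line
        return tuple(lam)
    if s == sigma + 1 and sigma == k:
        return None                  # summit meets the slope: j+1, ..., 2j
    for i in range(sigma):
        lam[i] -= 1                  # protraction: remove the slope line ...
    lam.append(sigma)                # ... and make it the new summit
    return tuple(lam)


if __name__ == "__main__":
    for lam in distinct_partitions(7):
        print(lam, "->", franklin(lam))
```

For n = 7 the only partition left unpaired is 4 + 3, which is the case j = 2 of the series j+1, ..., 2j with content $(3j^2+j)/2 = 7$.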


Sylvester’s student Durfee introduced the important concept of the Durfee square for 
the purpose of studying self-conjugate graphs. These graphs remain unchanged when 
rows of nodes are changed to columns. Sylvester gave the partition of 27 = 7 + 7 + 
4 + 3 + 2 + 2 + 2 as an example;⁴¹ it has the self-conjugate graph: 

• • • • • • •
• • • • • • •
• • • •
• • •
• •
• •
• •


Note that the largest square in this graph is of size 3 x 3 in the upper left corner and 
the remaining nodes form two graphs with nine nodes each, partitioned into identical 
partitions, 3 + 2+ 2+ 2, provided the nodes on the right-hand side of the square are 
read column-wise. The number of partitions of 9 in which the largest part is at most 3 
is the coefficient of $x^9$ in 

$$\frac{1}{(1-x)(1-x^2)(1-x^3)},$$

and this is the same as the coefficient of $x^{18}$ in 

$$\frac{1}{(1-x^2)(1-x^4)(1-x^6)}.$$


Sylvester applied this analysis to find the number of self-conjugate partitions of n. 
He considered all the partitions that could be dissected into a square of size $m^2$. The 
number of such partitions would be the coefficient of $x^{n-m^2}$ in 

$$\frac{1}{(1-x^2)(1-x^4)\cdots(1-x^{2m})},$$

or the coefficient of $x^n$ in 

$$\frac{x^{m^2}}{(1-x^2)(1-x^4)\cdots(1-x^{2m})}.$$

Thus, the number of self-conjugate partitions of n was the coefficient of $x^n$ in the 
series 

$$1 + \frac{x}{1-x^2} + \frac{x^4}{(1-x^2)(1-x^4)} + \frac{x^9}{(1-x^2)(1-x^4)(1-x^6)} + \cdots.$$


There is yet another manner in which a self-conjugate partition can be dissected: 
by counting the number of nodes in the m angles or bends, as Sylvester called them. 


41 ibid. § 27. 
Thus, for the self-conjugate partition of 27, there are three bends. The outermost right 
angle has thirteen nodes; the second has eleven; and the third three. It is easy to see 
that the number of nodes in each right angle of a self-conjugate partition will always 
be an odd number. Moreover, different right angles in the same partition will have 
different numbers of nodes. Thus, the number of self-conjugate partitions of n will be 
the coefficient of $x^n$ in $(1+x)(1+x^3)(1+x^5)\cdots$. Therefore 

$$\prod_{n=0}^{\infty}\left(1+x^{2n+1}\right) = \sum_{n=0}^{\infty}\frac{x^{n^2}}{(1-x^2)(1-x^4)\cdots(1-x^{2n})}.$$
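The two dissections of a self-conjugate graph (Durfee square and bends) can be compared numerically with a direct count. The sketch below assumes nothing beyond the definitions in the text; the function names are illustrative, and the check is a finite one.

```python
def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing tuple."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def conjugate(p):
    """Interchange rows and columns of the graph of p."""
    if not p:
        return ()
    return tuple(sum(1 for part in p if part > i) for i in range(p[0]))


def count_self_conjugate(n):
    return sum(1 for p in partitions(n) if conjugate(p) == p)


def bends_coeffs(limit):
    """Coefficients of (1+x)(1+x^3)(1+x^5)... up to x^limit."""
    c = [1] + [0] * limit
    for k in range(1, limit + 1, 2):
        for i in range(limit, k - 1, -1):
            c[i] += c[i - k]
    return c


def durfee_coeffs(limit):
    """Coefficients of sum_m x^{m^2} / ((1-x^2)(1-x^4)...(1-x^{2m})) up to x^limit."""
    total = [0] * (limit + 1)
    m = 0
    while m * m <= limit:
        term = [0] * (limit + 1)
        term[m * m] = 1
        for k in range(1, m + 1):
            for i in range(2 * k, limit + 1):
                term[i] += term[i - 2 * k]   # multiply by 1/(1 - x^{2k})
        total = [a + b for a, b in zip(total, term)]
        m += 1
    return total


if __name__ == "__main__":
    print([count_self_conjugate(n) for n in range(13)])
    print(bends_coeffs(12))
    print(durfee_coeffs(12))
```

The example partition 7 + 7 + 4 + 3 + 2 + 2 + 2 passes the `conjugate` test, and its bends 13, 11, 3 are the distinct odd parts assigned to it by the product side.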


Sylvester generalized this analysis of self-conjugate partitions⁴² by introducing an 
additional parameter a, whose exponent registered the number of parts in a partition, 
concluding that the coefficient of $x^n a^j$ in $(1+ax)(1+ax^3)\cdots(1+ax^{2j-1})\cdots$ was 
the same as in 

$$\frac{x^{j^2}a^j}{(1-x^2)(1-x^4)\cdots(1-x^{2j})}.$$

Thus, he had Euler’s formula 

$$\prod_{n=0}^{\infty}\left(1+ax^{2n+1}\right) = \sum_{n=0}^{\infty}\frac{x^{n^2}a^n}{(1-x^2)(1-x^4)\cdots(1-x^{2n})},$$


but by a combinatorial argument. 

By means of a Durfee square analysis, Sylvester also obtained the identity needed 
by Jacobi to complete his proof of the triple product identity.⁴³ Thus, in Jacobi’s 
formula 

$$\prod_{n=1}^{\infty}\frac{1}{1-aq^n} = \sum_{m=0}^{\infty}\frac{a^m q^{m^2}}{(1-q)(1-q^2)\cdots(1-q^m)\,(1-aq)(1-aq^2)\cdots(1-aq^m)},$$

the factor 

$$\frac{q^{m^2}a^m}{(1-q)(1-q^2)\cdots(1-q^m)}$$

accounted for the square and the nodes to the right of it, while 

$$\frac{1}{(1-aq)(1-aq^2)\cdots(1-aq^m)}$$


did the same for the nodes and the number of rows below the square. It is an interesting 
and instructive exercise to work out the details. 


42 ibid. § 28. 
43 ibid. § 33. 
Sylvester demonstrated that graphical analysis could also be used as a tool for the 
discovery of new identities. As an example, he presented⁴⁴ 

$$1 + \sum_{n=1}^{\infty}\frac{(1+ax)(1+ax^2)\cdots(1+ax^{n-1})(1+ax^{2n})\,a^n x^{n(3n-1)/2}}{(1-x)(1-x^2)\cdots(1-x^n)} = \prod_{n=1}^{\infty}(1+ax^n). \tag{26.10}$$


Briefly, Sylvester considered all partitions with distinct parts to account for the 
product on the right-hand side. To obtain the series on the left-hand side, he considered 
a graph of an arbitrary partition of n with j distinct parts. He supposed that the Durfee 
square (the largest square of nodes in the upper left corner) had $\theta^2$ nodes. Again, there 
were two subgraphs, called by Sylvester appendages: one to the right of the square 
with either θ or θ − 1 rows and with unrepeated parts; and one below the square with 
j − θ rows and with unrepeated parts. Moreover, since the parts were distinct, Sylvester 
observed that the subgraph below the square had the largest part, at most θ or θ − 1, 
depending on whether the subgraph to the right had θ or θ − 1 rows. In the first case, 
because $1 + 2 + \cdots + \theta = \frac{\theta^2+\theta}{2}$, the number of distributions was the coefficient of 
$x^{n-\theta^2}a^{j-\theta}$ in 

$$\frac{x^{\frac{\theta^2+\theta}{2}}}{(1-x)(1-x^2)\cdots(1-x^{\theta})}\,(1+ax)(1+ax^2)\cdots(1+ax^{\theta});$$

in the second case, it was the coefficient of $x^{n-\theta^2}a^{j-\theta}$ in 

$$\frac{x^{\frac{\theta^2-\theta}{2}}}{(1-x)(1-x^2)\cdots(1-x^{\theta-1})}\,(1+ax)(1+ax^2)\cdots(1+ax^{\theta-1}).$$

By adding these two expressions and restoring the factor $x^{\theta^2}a^{\theta}$ contributed by the 
square, Sylvester obtained the θth term of his series and 
this proved the formula. Note that Euler’s pentagonal number theorem follows from 
Sylvester’s formula when one takes a = —1. Sylvester commented, “Such is one of 
the fruits among a multitude arising out of Mr. Durfee’s ever-memorable example of 
the dissection of a graph (in the case of a symmetrical one) into a square, and two 
regular graph appendages.” 
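Sylvester's formula (26.10) can be sanity-checked with truncated power series in x for a few integer values of a. This is a numerical sketch, not a proof; the form of the series used is the one with general term $(1+ax)\cdots(1+ax^{n-1})(1+ax^{2n})a^nx^{n(3n-1)/2}/((1-x)\cdots(1-x^n))$, and the function names are illustrative.

```python
def mul_one_plus(c, a, k):
    """Multiply the truncated series c in place by (1 + a x^k)."""
    for i in range(len(c) - 1, k - 1, -1):
        c[i] += a * c[i - k]


def div_one_minus(c, k):
    """Multiply the truncated series c in place by 1/(1 - x^k)."""
    for i in range(k, len(c)):
        c[i] += c[i - k]


def product_side(a, size):
    """Coefficients of prod_{n>=1} (1 + a x^n), truncated to `size` terms."""
    c = [1] + [0] * (size - 1)
    for k in range(1, size):
        mul_one_plus(c, a, k)
    return c


def series_side(a, size):
    """Coefficients of Sylvester's series, truncated to `size` terms."""
    total = [1] + [0] * (size - 1)
    n = 1
    while n * (3 * n - 1) // 2 < size:
        term = [0] * size
        term[n * (3 * n - 1) // 2] = a ** n
        for k in range(1, n):
            mul_one_plus(term, a, k)
        mul_one_plus(term, a, 2 * n)
        for k in range(1, n + 1):
            div_one_minus(term, k)
        total = [s + t for s, t in zip(total, term)]
        n += 1
    return total


if __name__ == "__main__":
    print(series_side(-1, 16))   # pentagonal number theorem when a = -1
    print(product_side(-1, 16))
```

With a = −1 each term collapses to $(-1)^n x^{n(3n-1)/2}(1+x^n)$, so the output shows nonzero coefficients only at the pentagonal exponents 0, 1, 2, 5, 7, 12, 15.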


26.3 Cayley: Sylvester’s Formula 


In 1882, Sylvester’s mathematical correspondent and comrade, Cayley, responded to 
his friend’s great paper on partitions by showing how the “very beautiful formula” 
(26.10) could be proved by an interesting analytic method. He expressed the series 
side of the formula as⁴⁵ 


44 ibid. § 35. 
45 See Cayley (1889-1898) vol. 12, pp. 217-219. 
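$$\Omega = 1 + P + Q(1+ax) + R(1+ax)(1+ax^2) + S(1+ax)(1+ax^2)(1+ax^3) + \cdots, \tag{26.11}$$

where 

$$P = \frac{(1+ax^2)xa}{\mathbf{1}},\quad Q = \frac{(1+ax^4)x^5a^2}{\mathbf{1}\cdot\mathbf{2}},\quad R = \frac{(1+ax^6)x^{12}a^3}{\mathbf{1}\cdot\mathbf{2}\cdot\mathbf{3}},\quad S = \frac{(1+ax^8)x^{22}a^4}{\mathbf{1}\cdot\mathbf{2}\cdot\mathbf{3}\cdot\mathbf{4}},\ \ldots, \tag{26.12}$$

where the numbers in bold 1, 2, 3, 4, ... denoted $1-x,\ 1-x^2,\ 1-x^3,\ 1-x^4,\ \ldots$. 
Cayley observed that the x exponents 1, 5, 12, 22, ... were the pentagonal numbers 
$\frac{n(3n-1)}{2}$. Cayley then set 

$$P' = \frac{ax^2}{\mathbf{1}},\quad Q' = \frac{ax^3}{\mathbf{1}} + \frac{a^2x^7}{\mathbf{1}\cdot\mathbf{2}},\quad R' = \frac{ax^4}{\mathbf{1}} + \frac{a^2x^9}{\mathbf{1}\cdot\mathbf{2}} + \frac{a^3x^{15}}{\mathbf{1}\cdot\mathbf{2}\cdot\mathbf{3}},\quad \text{etc.},$$

where the x exponents were 

2; 3, 3+4; 4, 4+5, 4+5+6; etc. 

He then noted that it was easily verified that 

$$\begin{aligned}
1 + P &= (1+ax)(1+P'),\\
1 + P' + Q &= (1+ax^2)(1+Q'),\\
1 + Q' + R &= (1+ax^3)(1+R'),\\
1 + R' + S &= (1+ax^4)(1+S'),\ \text{etc.}
\end{aligned}$$

Cayley concluded from these relations that 

$$\begin{aligned}
\Omega \div (1+ax) &= 1 + P' + Q + R(1+ax^2) + S(1+ax^2)(1+ax^3) + \cdots,\\
\Omega \div (1+ax)(1+ax^2) &= 1 + Q' + R + S(1+ax^3) + T(1+ax^3)(1+ax^4) + \cdots,\\
\Omega \div (1+ax)(1+ax^2)(1+ax^3) &= 1 + R' + S + T(1+ax^4) + \cdots,
\end{aligned}$$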


and so on. When this was done infinitely often, the right-hand side would become 1, 
giving Cayley the complete proof of Sylvester’s theorem. 

Andrews pointed out that Cayley’s method is more easily understood in terms of 
Euler’s method of proving the pentagonal number theorem. In Sylvester’s series, take 
the factor $1 + ax^{2n}$ in the (n + 1)th term, counting 1 as the first term, and split it into 
two parts: 

$$1 + ax^{2n} = (1 - x^n) + x^n(1 + ax^n). \tag{26.13}$$
Interestingly, this breaks the (n + 1)th term into two parts, so that one can associate 
the second part of the term with the first part of the succeeding term. The result of this 
association is 

$$\begin{aligned}
&\frac{(1+ax)(1+ax^2)\cdots(1+ax^{n-1})(1+ax^n)\,x^n a^n x^{\frac{n(3n-1)}{2}}}{(1-x)(1-x^2)\cdots(1-x^n)}
+ \frac{(1+ax)(1+ax^2)\cdots(1+ax^n)\,a^{n+1}x^{\frac{(n+1)(3n+2)}{2}}}{(1-x)(1-x^2)\cdots(1-x^n)}\\
&\qquad = (1+ax)\cdot\frac{(1+ax^2)\cdots(1+ax^n)\,a^n x^{\frac{n(3n+1)}{2}}}{(1-x)\cdots(1-x^n)}\,\left(1+ax^{2n+1}\right).
\end{aligned} \tag{26.14}$$


Now observe that the factor multiplying 1 + ax is the (n + 1)th term of Sylvester’s 
series except that a has been replaced by ax. So, if we denote Sylvester’s series by 
f(a), then by (26.14) we have 

$$\begin{aligned}
f(a) &= (1+ax)\,f(ax)\\
&= (1+ax)(1+ax^2)\,f(ax^2)\\
&= \cdots = (1+ax)(1+ax^2)\cdots(1+ax^n)\,f(ax^n).
\end{aligned}$$

Note that for convergence we would require |x| < 1, and therefore $x^n \to 0$ as $n \to \infty$. 
This implies 

$$\lim_{n\to\infty} f(ax^n) = f(0) = 1.$$


We note that Gauss also used this method on various occasions and that it is possible 
that Cayley rediscovered Euler’s method. We shall see in the next section how 
Ramanujan made brilliant use of this technique to prove the Rogers—Ramanujan 
identities. 


26.4 Ramanujan: Rogers—Ramanujan Identities 


Ramanujan discovered a new proof of the Rogers–Ramanujan identities after he saw 
Rogers’s original proof, presented in Chapter 27. Ramanujan communicated the proof 
to Hardy in a letter of April 1919 and Hardy had it published in a 1919 paper.⁴⁶ 
In this paper, Hardy also included another proof, sent by Rogers to MacMahon in 
October 1917. Ramanujan’s proof started with a series very similar to Sylvester’s 
series (26.10): 

$$G(x) = 1 + \sum_{n=1}^{\infty}(-1)^n x^{2n} q^{\frac{1}{2}n(5n-1)}\left(1 - xq^{2n}\right)\frac{(1-xq)(1-xq^2)\cdots(1-xq^{n-1})}{(1-q)(1-q^2)\cdots(1-q^n)}. \tag{26.15}$$

46 Ramanujan (1919a). 
He split this series into two parts, exactly as Cayley had done with Sylvester’s series, 
by applying 

$$1 - xq^{2n} = 1 - q^n + q^n(1 - xq^n)$$

to transform (26.15) into 

$$G(x) = \sum_{n=1}^{\infty}(-1)^{n-1}x^{2n-2}q^{\frac{1}{2}(5n^2-9n+4)}\left(1 - x^2q^{4n-2}\right)\frac{(1-xq)\cdots(1-xq^{n-1})}{(1-q)\cdots(1-q^{n-1})}, \tag{26.16}$$

where the empty product when n = 1 was set equal to 1. Ramanujan then set 

$$H(x) = \frac{G(x)}{1-xq} - G(xq)$$

and used the value of G(x) from (26.16) and of G(xq) from (26.15) to obtain 

$$\begin{aligned}
H(x) = xq &- \frac{x^2q^3}{1-q}\left((1-q) + xq^4(1-xq^2)\right)\\
&+ \frac{x^4q^{11}(1-xq^2)}{(1-q)(1-q^2)}\left((1-q^2) + xq^7(1-xq^3)\right)\\
&- \frac{x^6q^{24}(1-xq^2)(1-xq^3)}{(1-q)(1-q^2)(1-q^3)}\left((1-q^3) + xq^{10}(1-xq^4)\right) + \cdots.
\end{aligned}$$

Again, as in Cayley’s argument, Ramanujan associated the second part of each term 
with the first part of the succeeding term to arrive at the relation 


$$H(x) = xq(1 - xq^2)\,G(xq^2)$$

or 

$$G(x) = (1-xq)\,G(xq) + xq(1-xq)(1-xq^2)\,G(xq^2).$$

Setting 

$$F(x) = \frac{G(x)}{(1-xq)(1-xq^2)(1-xq^3)\cdots},$$

Ramanujan obtained the relation 

$$F(x) = F(xq) + xq\,F(xq^2). \tag{26.17}$$

He observed that it readily followed that 

$$F(x) = 1 + \frac{xq}{1-q} + \frac{x^2q^4}{(1-q)(1-q^2)} + \frac{x^3q^9}{(1-q)(1-q^2)(1-q^3)} + \cdots, \tag{26.18}$$

an equation that follows on writing 

$$F(x) = 1 + A_1(q)x + A_2(q)x^2 + A_3(q)x^3 + \cdots.$$

Applying (26.17), he obtained 

$$A_n(q) = \frac{q^{2n-1}}{1-q^n}\,A_{n-1}(q), \qquad n = 1,2,3,\ldots,$$

where $A_0(q) = 1$. This implied (26.18). Ramanujan obtained the required identities 
by taking x = 1 and x = q in (26.18). For x = 1, he got 

$$1 + \frac{q}{1-q} + \frac{q^4}{(1-q)(1-q^2)} + \cdots = \frac{G(1)}{(1-q)(1-q^2)(1-q^3)\cdots} = \frac{1 - q^2 - q^3 + q^9 + q^{11} - \cdots}{(1-q)(1-q^2)(1-q^3)\cdots}.$$


The series can be converted to a product by the triple product identity and the result 
follows. 
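The end result can be checked numerically: the series obtained from (26.18) at x = 1 and x = q should agree with the products over parts congruent to ±1 and ±2 (mod 5) respectively. The sketch below compares truncated power series; the product forms are the standard statement of the Rogers–Ramanujan identities, and the function names are illustrative.

```python
def rr_sum_coeffs(limit, shift):
    """Coefficients of sum_{n>=0} q^{n^2 + shift*n} / ((1-q)...(1-q^n)) up to q^limit.

    shift = 0 corresponds to x = 1 in (26.18), shift = 1 to x = q.
    """
    total = [0] * (limit + 1)
    n = 0
    while n * n + shift * n <= limit:
        term = [0] * (limit + 1)
        term[n * n + shift * n] = 1
        for k in range(1, n + 1):
            for i in range(k, limit + 1):
                term[i] += term[i - k]       # multiply by 1/(1 - q^k)
        total = [a + b for a, b in zip(total, term)]
        n += 1
    return total


def rr_product_coeffs(limit, residues):
    """Coefficients of prod 1/(1-q^k) over k with k mod 5 in `residues`, up to q^limit."""
    c = [1] + [0] * limit
    for k in range(1, limit + 1):
        if k % 5 in residues:
            for i in range(k, limit + 1):
                c[i] += c[i - k]
    return c


if __name__ == "__main__":
    print(rr_sum_coeffs(12, 0))
    print(rr_product_coeffs(12, {1, 4}))
```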


26.5 Ramanujan’s Congruence Properties of Partitions 


Ramanujan was the first mathematician to study the divisibility properties of the 
partition function. In a paper published in 1919,⁴⁷ he gave fairly simple proofs of 
the congruence relations $p(5m+4) \equiv 0 \pmod 5$ and $p(7m+5) \equiv 0 \pmod 7$. He 
started with Euler’s generating function for p(n), 

$$\frac{q}{(1-q)(1-q^2)(1-q^3)\cdots} = \sum_{n=1}^{\infty} p(n-1)q^n, \tag{26.19}$$

and observed that the first congruence would follow if the coefficient of $q^{5m}$ on the 
right-hand side were divisible by 5. Thus, it was sufficient to show that the same was 
true for the coefficient of $q^{5m}$ in 

$$\frac{q(1-q^5)(1-q^{10})(1-q^{15})\cdots}{(1-q)(1-q^2)(1-q^3)\cdots} = q\,\frac{(1-q^5)(1-q^{10})\cdots}{\{(1-q)(1-q^2)\cdots\}^5}\,\{(1-q)(1-q^2)(1-q^3)\cdots\}^4.$$

Ramanujan then noted that $1 - q^{5m} \equiv (1-q^m)^5 \pmod 5$ and hence 

$$\frac{(1-q^5)(1-q^{10})(1-q^{15})\cdots}{\{(1-q)(1-q^2)(1-q^3)\cdots\}^5} \equiv 1 \pmod 5.$$


47 Ramanujan (1919b). 
Thus, to prove that $p(5m+4) \equiv 0 \pmod 5$, it was enough to show that the 
coefficient of $q^{5m}$ in 

$$\begin{aligned}
q\{(1-q)(1-q^2)(1-q^3)\cdots\}^4 &= q\{(1-q)(1-q^2)(1-q^3)\cdots\}^3\,(1-q)(1-q^2)(1-q^3)\cdots\\
&= q\sum_{m=0}^{\infty}(-1)^m(2m+1)q^{\frac{m(m+1)}{2}}\sum_{n=-\infty}^{\infty}(-1)^n q^{\frac{n(3n+1)}{2}}
\end{aligned}$$

was divisible by 5. Observe that in the last step, Ramanujan used Jacobi’s identity and 
Euler’s pentagonal number theorem. Jacobi’s identity can be derived from the triple 
product identity. He then noted that the exponent of q in the double sum was divisible 
by 5 when 

$$\begin{aligned}
1 + \frac{m(m+1)}{2} + \frac{n(3n+1)}{2} &\equiv 0 \pmod 5, \quad\text{or}\\
8 + 4m(m+1) + 4n(3n+1) &\equiv 0 \pmod 5, \quad\text{or}\\
(2m+1)^2 + 2(n+1)^2 &\equiv 0 \pmod 5.
\end{aligned}$$

Ramanujan noted that $(2m+1)^2 \equiv 0, 1, 4 \pmod 5$ and $2(n+1)^2 \equiv 0, 2, 3 \pmod 5$. 
So when the exponent of q was a multiple of 5, then $2m+1 \equiv 0 \pmod 5$ and $n+1 \equiv 0 \pmod 5$. 
Since each contribution to the coefficient of this power of q was ±(2m + 1), a multiple of 5, 
Ramanujan’s proof was complete. He gave a similar proof for the congruence 
modulo 7; see the Exercises. 
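Both congruences, and the residue argument in the last step, are easy to test numerically. The sketch below computes p(n) from Euler's pentagonal number recurrence; it is a check on finitely many values rather than a proof, and the function name is illustrative.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] using Euler's pentagonal number recurrence."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total, k = 0, 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 else -1
            total += sign * p[n - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= n:
                total += sign * p[n - k * (3 * k + 1) // 2]
            k += 1
        p[n] = total
    return p


def residue_pairs_mod5():
    """Pairs (m, n) mod 5 for which (2m+1)^2 + 2(n+1)^2 is divisible by 5."""
    return [(m, n) for m in range(5) for n in range(5)
            if ((2 * m + 1) ** 2 + 2 * (n + 1) ** 2) % 5 == 0]


if __name__ == "__main__":
    p = partition_numbers(60)
    print([p[5 * m + 4] for m in range(6)])
    print([p[7 * m + 5] for m in range(6)])
    print(residue_pairs_mod5())
```

The only surviving residue pair is m ≡ 2, n ≡ 4 (mod 5), which is exactly the case 2m + 1 ≡ 0 and n + 1 ≡ 0 used in the argument.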

In the same paper, Ramanujan outlined a proof of (26.7). He intended to publish 
more details at a later date, but his premature death made this impossible. However, 
he wrote notes giving these details, later found and published as part of his lost 
notebook.⁴⁸ In fact, this proof used only the pentagonal number theorem, Jacobi’s 
identity 

$$\prod_{n=1}^{\infty}(1-q^n)^3 = \sum_{m=0}^{\infty}(-1)^m(2m+1)q^{\frac{m(m+1)}{2}}, \tag{26.20}$$

and fifth roots of unity. By the pentagonal number theorem: 

$$\prod_{n=1}^{\infty}\left(1-q^{\frac{n}{5}}\right) = \sum_{n=-\infty}^{\infty}(-1)^n q^{\frac{n(3n+1)}{10}}. \tag{26.21}$$

Partition the series into five parts according to whether $n \equiv 0, \pm 1, \pm 2 \pmod 5$. For 
example, for $n \equiv 0 \pmod 5$, we have n = 5m and the part of the series corresponding 
to these values of n would be given by 

$$\sum_{m=-\infty}^{\infty}(-1)^m q^{\frac{m(15m+1)}{2}}.$$


48 Ramanujan (1988) pp. 238-239. 
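Note that the subseries corresponding to n = 5m − 1 can once again be expressed 
as a product: 

$$\sum_{m=-\infty}^{\infty}(-1)^{5m-1}q^{\frac{(5m-1)(15m-2)}{10}} = -q^{\frac{1}{5}}\sum_{m=-\infty}^{\infty}(-1)^m q^{\frac{5m(3m-1)}{2}} = -q^{\frac{1}{5}}\prod_{n=1}^{\infty}(1-q^{5n}).$$

Thus, (26.21) can be written as 

$$\begin{aligned}
\prod_{n=1}^{\infty}\left(1-q^{\frac{n}{5}}\right) ={}& \sum_{m=-\infty}^{\infty}(-1)^m q^{\frac{m(15m+1)}{2}} + \sum_{m=-\infty}^{\infty}(-1)^m q^{\frac{(3m-1)(5m-2)}{2}}\\
&- q^{\frac{2}{5}}\left(\sum_{m=-\infty}^{\infty}(-1)^m q^{\frac{m(15m+7)}{2}} - \sum_{m=-\infty}^{\infty}(-1)^m q^{\frac{(3m+2)(5m+1)}{2}}\right)\\
&- q^{\frac{1}{5}}\prod_{n=1}^{\infty}(1-q^{5n}).
\end{aligned}$$

Dividing by $\prod_{n=1}^{\infty}(1-q^{5n})$ gives 

$$\prod_{n=1}^{\infty}\frac{1-q^{\frac{n}{5}}}{1-q^{5n}} = \xi_1 - q^{\frac{1}{5}} - \xi q^{\frac{2}{5}}, \tag{26.22}$$

where $\xi$ and $\xi_1$ are power series in q. Ramanujan applied Jacobi’s identity to show 
that $\xi\xi_1 = 1$. So cube both sides of (26.22) and use Jacobi’s identity (26.20) to get 

$$\frac{\sum_{n=0}^{\infty}(-1)^n(2n+1)q^{\frac{n(n+1)}{10}}}{\sum_{n=0}^{\infty}(-1)^n(2n+1)q^{\frac{5n(n+1)}{2}}} = \left(\xi_1 - q^{\frac{1}{5}} - \xi q^{\frac{2}{5}}\right)^3. \tag{26.23}$$

Since the exponent of q, given by n(n + 1), is either 0, 2, or 6 (mod 10), it follows 
that no power of q is of the form 2/5 + an integer. This implies that the term 

$$3\xi_1 q^{\frac{2}{5}} - 3\xi_1^2\,\xi\, q^{\frac{2}{5}}$$

on the right-hand side of (26.23) must be zero. This in turn implies that $\xi_1 = \xi^{-1}$ and 
we can write 

$$\prod_{n=1}^{\infty}\frac{1-q^{\frac{n}{5}}}{1-q^{5n}} = \xi^{-1} - q^{\frac{1}{5}} - \xi q^{\frac{2}{5}}. \tag{26.24}$$

Consider the expression $\lambda^{-1} - \lambda - 1$, where $\lambda = \xi q^{\frac{1}{5}}\omega$, and ω is a fifth root of 
unity. Observe that if $\lambda^{-1} - \lambda = 1$, then by an elementary calculation $\lambda^{-5} - \lambda^{5} = 11$. 
Thus, 

$$\xi^{-5} - 11q - \xi^{5}q^{2} = \prod_{k=0}^{4}\left(\xi^{-1} - q^{\frac{1}{5}}\omega^{k} - \xi q^{\frac{2}{5}}\omega^{2k}\right).$$

It is now easy to check by long division that 

$$\prod_{n=1}^{\infty}\frac{1-q^{5n}}{1-q^{\frac{n}{5}}} = \frac{1}{\xi^{-1} - q^{\frac{1}{5}} - \xi q^{\frac{2}{5}}}
= \frac{\xi^{-4} - 3q\xi + q^{\frac{1}{5}}\left(\xi^{-3} + 2q\xi^{2}\right) + q^{\frac{2}{5}}\left(2\xi^{-2} - q\xi^{3}\right) + q^{\frac{3}{5}}\left(3\xi^{-1} + q\xi^{4}\right) + 5q^{\frac{4}{5}}}{\xi^{-5} - 11q - q^{2}\xi^{5}}.$$

Now multiply across by $q^{\frac{1}{5}}$ and replace $q^{\frac{1}{5}}$ by $q^{\frac{1}{5}}\omega^{k}$, k = 1, 2, 3, 4, to obtain five 
identities. Next, apply 

$$q^{\frac{1}{5}}\prod_{n=1}^{\infty}\left(1-q^{\frac{n}{5}}\right)^{-1} = \sum_{n=1}^{\infty}p(n-1)\,q^{\frac{n}{5}}$$

and add the five identities to get 

$$\prod_{n=1}^{\infty}(1-q^{5n})\sum_{n=0}^{\infty}p(5n+4)\,q^{n} = \frac{5}{\xi^{-5} - 11q - q^{2}\xi^{5}}. \tag{26.25}$$

By replacing $q^{\frac{1}{5}}$ by $\omega^{k}q^{\frac{1}{5}}$, k = 0, 1, 2, 3, 4, in (26.24) and multiplying the five equations 
together, Ramanujan arrived at 

$$\prod_{n=1}^{\infty}\frac{(1-q^{n})^{6}}{(1-q^{5n})^{6}} = \xi^{-5} - 11q - q^{2}\xi^{5}. \tag{26.26}$$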
Combining (26.25) and (26.26) gives the necessary result. 
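Eliminating $\xi^{-5} - 11q - q^2\xi^5$ between (26.25) and (26.26) leaves an identity with no ξ in it, namely $\sum_{n\ge 0} p(5n+4)q^n = 5\prod_{n\ge 1}(1-q^{5n})^5/(1-q^n)^6$; that this is the content of (26.7) is an assumption here, since (26.7) is stated elsewhere in the chapter. The sketch below tests this identity on truncated series; the function names are illustrative.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] using Euler's pentagonal number recurrence."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total, k = 0, 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 else -1
            total += sign * p[n - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= n:
                total += sign * p[n - k * (3 * k + 1) // 2]
            k += 1
        p[n] = total
    return p


def product_side(limit):
    """Coefficients of 5 * prod (1-q^{5n})^5 / (1-q^n)^6 up to q^limit."""
    c = [5] + [0] * limit
    for k in range(1, limit + 1):
        for _ in range(6):                       # divide by (1 - q^k) six times
            for i in range(k, limit + 1):
                c[i] += c[i - k]
    for k in range(5, limit + 1, 5):
        for _ in range(5):                       # multiply by (1 - q^k) five times
            for i in range(limit, k - 1, -1):
                c[i] -= c[i - k]
    return c


if __name__ == "__main__":
    p = partition_numbers(5 * 10 + 4)
    print([p[5 * n + 4] for n in range(11)])
    print(product_side(10))
```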
In his paper, Ramanujan also noted that 

$$\xi = \prod_{n=0}^{\infty}\frac{(1-q^{5n+1})(1-q^{5n+4})}{(1-q^{5n+2})(1-q^{5n+3})}. \tag{26.27}$$

This can be proved by using (26.24), the pentagonal number theorem and the quintuple 
product identity. Ramanujan observed in his paper that (26.7) implied that p(25m + 24) 
was divisible by 25. He argued that by (26.7), 

$$\frac{p(4)x + p(9)x^2 + p(14)x^3 + \cdots}{5\{(1-x^5)(1-x^{10})(1-x^{15})\cdots\}^4} = \frac{x}{(1-x)(1-x^2)(1-x^3)\cdots}\cdot\frac{(1-x^5)(1-x^{10})(1-x^{15})\cdots}{\{(1-x)(1-x^2)(1-x^3)\cdots\}^5},$$

and since the coefficient of $x^{5n}$ on the right-hand side was a multiple of 5, it followed 
that p(25m + 24) was divisible by 25. It is interesting to see how Ramanujan 
used (26.27) to compute the Rogers–Ramanujan continued fraction as well as the 
generating function for p(5m + 4). 


26.6 Exercises 


(1) Prove that the number of partitions of n into parts not divisible by d is equal 
to the number of partitions of n of the form $n = n_1 + n_2 + \cdots + n_s$, where 
$n_i \ge n_{i+1}$ and $n_i \ge n_{i+d-1} + 1$. See Glaisher (1883). James Whitbread Lee 
Glaisher (1848-1928) single-handedly edited two journals for over forty years: 
Messenger of Mathematics and Quarterly Journal. The Messenger carried the 
first published papers of many English mathematicians and physicists of the 
late nineteenth century, including H. F. Baker, E. W. Barnes, W. Burnside, 
G. H. Hardy, J. J. Thomson, and J. Jeans. Glaisher published almost four 
hundred papers, many of them in his own journals. G. H. Hardy wrote in his 
obituary notice, “He wrote a great deal of very uneven quality, and he was 
‘old fashioned’ in a sense which is most unusual now; but the best of his 
work is really good.” This best work included results in number theory and, 
in particular, the representation of numbers as sums of squares. See vol. 7 of 
Hardy (1966-1979). 

(2) Complete the following number theoretic proof, due to Glaisher, of Euler’s 
theorem that the number of partitions of n into odd parts equals the number of 
partitions of n into distinct parts. Let 

$$n = f_1\cdot 1 + f_3\cdot 3 + \cdots + f_{2m-1}\cdot(2m-1).$$

Here $f_1, f_3, \ldots$ represent the number of times 1, 3, ..., respectively, occur in 
the partition of n into odd parts. Now write $f_1, f_3, \ldots$ in powers of two: 

$$f_1 = 2^{a_1} + 2^{a_2} + \cdots, \qquad f_3 = 2^{b_1} + 2^{b_2} + \cdots, \qquad \ldots$$

Then 

$$n = 2^{a_1} + 2^{a_2} + \cdots + 2^{b_1}\cdot 3 + 2^{b_2}\cdot 3 + \cdots$$

gives a partition of n into distinct parts. See Glaisher (1883). 

(3) Show that the number of partitions of n into odd parts, where exactly k 
distinct parts appear, is equal to the number of partitions of n into distinct 
parts, where exactly k sequences of consecutive integers appear. Show that this 
correspondence is one-to-one. This result was published by Sylvester in 1882. 
See Sylvester (1973) vol. 4, p. 45. 
(4) Let $p_{k,r}(n)$ denote the number of partitions of n into parts not congruent to 0, 
$\pm r \pmod{2k+1}$, where $1 \le r \le k$. Let $q_{k,r}(n)$ denote the number of partitions 
of n of the form $n = n_1 + n_2 + \cdots + n_s$ where $n_i \ge n_{i+1}$, $n_i \ge n_{i+k-1} + 2$ 
and with 1 appearing as a part at most r − 1 times. Prove that then 

$$p_{k,r}(n) = q_{k,r}(n).$$

See Gordon (1961). 
(5) Prove that if $p_m(n)$ denotes the number of partitions of n with rank m, then 

$$\sum_{n=1}^{\infty}p_m(n)\,q^n = \prod_{n=1}^{\infty}\frac{1}{1-q^n}\;\sum_{n=1}^{\infty}(-1)^{n-1}q^{\frac{n(3n-1)}{2}+|m|n}\,(1-q^n).$$

See Atkin and Swinnerton-Dyer (1954). 

(6) Derive Szekeres’s combinatorial interpretation of the Rogers–Ramanujan continued 
fraction. Let B(n,k), n ≥ 1, k ≥ 0, represent the number of sequences 
of integers $b_1 \le b_2 \le \cdots \le b_n = n$ with $b_i > i$ for $1 \le i < n$ and 
$b_1 + b_2 + \cdots + b_{n-1} = \binom{n}{2} + k$. Observe that B(n,k) = 0 for 0 ≤ k < n − 1 
and for $k > \binom{n}{2}$. Also note that $B(n,n-1) = B\left(n,\binom{n}{2}\right) = 1$, B(1,0) = 1. 
Now show that 

$$\cfrac{x}{1+\cfrac{qx}{1+\cfrac{q^2x}{1+\cdots}}} = \sum_{n=1}^{\infty}\sum_{k=n-1}^{\binom{n}{2}}(-1)^{n+1}B(n,k)\,x^nq^k.$$

See Szekeres (1968). George Szekeres (1911-2005) was trained as a chemical 
engineer in Hungary but his association with Paul Erdős, Esther Klein, and Paul 
Turán turned his interest to mathematics. In 1935, Erdős and Szekeres wrote a 
paper laying the foundation of Ramsey theory. This arose out of Szekeres’s 
efforts to solve a problem (proposed by Klein who later became his wife): For 
all n there exists N such that for any N points in a plane there are n which 
form a convex n-gon. Szekeres and his family escaped to China from the Nazi 
government in Germany and moved to Australia after the war. His presence 
gave a boost to the development and teaching of mathematics in Australia 
where he was greatly admired. 

(7) Andrews gave an interpretation of the Rogers–Ramanujan continued fraction 
different from that of Szekeres. Let 

$$C(q) = 1 + \cfrac{q}{1+\cfrac{q^2}{1+\cfrac{q^3}{1+\cdots}}} = \sum_{m\ge 0}c_m q^m.$$

Also let $B_{k,a}(n)$ denote the number of partitions of n of the form $n = b_1 + 
b_2 + \cdots + b_s$, where $b_i \ge b_{i+1}$, $b_i - b_{i+k-1} \ge 2$ and at most a − 1 of the $b_i$ 
are equal to one. Then prove that 

$$\begin{aligned}
c_{5m} &= B_{37,37}(m) + B_{37,13}(m-4),\\
c_{5m+1} &= B_{37,32}(m) + B_{37,7}(m-6),\\
c_{5m+2} &= -\left(B_{37,23}(m-1) + B_{37,2}(m-8)\right),\\
c_{5m+3} &= -\left(B_{37,28}(m) + B_{37,22}(m-1)\right),\\
c_{5m+4} &= -\left(B_{37,17}(m-2) + B_{37,8}(m-5)\right).
\end{aligned}$$

Show that it follows that, in particular, $c_2 = c_4 = c_9 = 0$ and that the remaining 
$c_n$ satisfy 

$$c_{5m} > 0,\quad c_{5m+1} > 0,\quad c_{5m+2} < 0,\quad c_{5m+3} < 0,\quad c_{5m+4} < 0.$$

See Andrews (1981). 
(8) Prove Ramanujan’s result that $p(7m+5) \equiv 0 \pmod 7$ by the following 
method. Square Jacobi’s identity to obtain 

$$x^2\prod_{n=1}^{\infty}(1-x^n)^6 = \sum_{\mu=0}^{\infty}\sum_{\nu=0}^{\infty}(-1)^{\mu+\nu}(2\mu+1)(2\nu+1)\,x^{2+\frac{\mu(\mu+1)}{2}+\frac{\nu(\nu+1)}{2}}.$$

Now show that the coefficient of $x^{7m}$ in the sum is divisible by 49. Next observe 
that 

$$\frac{1-x^7}{(1-x)^7} \equiv 1 \pmod 7$$

and deduce that the coefficient of $x^{7m}$ in $x^2\prod_{n=1}^{\infty}(1-x^n)^{-1}$ is a multiple of 7. See 
Ramanujan (2000) p. 212. 

(9) For a partition π of n, let λ(π) denote the largest part of π; let μ(π) denote the 
number of ones in π; and let ν(π) denote the number of parts of π larger than 
μ(π). The crank c(π) is defined as 

c(π) = λ(π) if μ(π) = 0; c(π) = ν(π) − μ(π) if μ(π) > 0. 

Let M(m,n) denote the number of partitions of n with crank m. Then prove 
that 

$$\sum_{m=-\infty}^{\infty}\sum_{n=0}^{\infty}M(m,n)\,a^mq^n = \frac{(q;q)_\infty}{(aq;q)_\infty\,(q/a;q)_\infty}.$$

See Andrews and Garvan (1988a). 


26.7 Notes on the Literature 


For more on partitions, see Andrews (1998). Ramanujan (2000) contains his published 
papers on partitions; Berndt has also added seventy pages of helpful commentary at the 
end of the book, where one will find references to other works on Ramanujan’s papers. 
The theory of modular forms has been extensively used in recent years to study the 
partition function; K. Ono and his students and collaborators have been leaders in this 
area. Modular forms are also important in the study of many arithmetical problems. 
For example, Fermat’s last theorem is a consequence of the Shimura-Taniyama 
conjecture. See Shimura (2008) for an account of how he arrived at this conjecture. 
q-Series and q-Orthogonal Polynomials 


27.1 Preliminary Remarks 


In the early nineteenth century, q-series proved their worth with their broad applicability 
to number theory, elliptic and modular functions, and combinatorics. Nevertheless, 
as late as 1840, no general framework for the study of q-series had been established; 
although q-series had been used to solve problems in other areas, it had not become 
a subject of its own. Finally, q-series came into its own when it was viewed as an 
extension of the hypergeometric series. As discussed in Chapter 26, in the 1840s 
Cauchy, Eisenstein, Jacobi, and Heine each presented the q-binomial theorem for 
general exponents. In 1846, Jacobi wrote a paper stating a q-extension of Gauss’s 
${}_2F_1$ summation formula.¹ His interesting proof was based on Schweins’s q-extension 
of the Vandermonde identity, a terminating form of Gauss’s summation. The latter 
result gave the value of the series in terms of gamma functions. So Jacobi suggested a 
q-analog of $\Pi(a)$: 

$$\Omega(q,a) = \frac{(1-q)(1-q^2)(1-q^3)\cdots}{(1-q^{a+1})(1-q^{a+2})(1-q^{a+3})\cdots}. \tag{27.1}$$

It is interesting that at the end of his 1846 paper, Jacobi wrote a lengthy 
historical note mentioning the 1729 letter from Euler to Goldbach, containing Euler’s 
description of his discovery of the gamma function by the use of infinite products. 
Jacobi had a keen interest in the history of mathematics, and some of his papers 
contain very helpful historical information. With Jacobi’s work, the stage was set to 
obtain the q-extension of the hypergeometric series and this was soon accomplished 
by Heinrich Eduard Heine (1821-1881) who studied in Göttingen under Gauss and 
Stern and in Berlin under Dirichlet. Heine received his doctoral degree under the 
supervision of Dirksen and Ohm in 1842. He then spent a year in Königsberg with 
Jacobi and Franz Neumann. It is most likely that Jacobi encouraged Heine to work on 
hypergeometric series and its q-extension; Heine later edited a posthumous paper of 
Jacobi on this subject. 


1 Jacobi (1846). 
In his 1847 paper defining the q-hypergeometric series,² Heine developed properties 
of the series $\phi(\alpha,\beta,\gamma,q,x)$ defined by 

$$1 + \frac{(1-q^{\alpha})(1-q^{\beta})}{(1-q)(1-q^{\gamma})}\,x + \frac{(1-q^{\alpha})(1-q^{\alpha+1})(1-q^{\beta})(1-q^{\beta+1})}{(1-q)(1-q^2)(1-q^{\gamma})(1-q^{\gamma+1})}\,x^2 + \cdots. \tag{27.2}$$

Observe that as $q \to 1^-$, this series converges term by term to 

$$1 + \frac{\alpha\cdot\beta}{1\cdot\gamma}\,x + \frac{\alpha(\alpha+1)\cdot\beta(\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}\,x^2 + \cdots,$$

the hypergeometric series $F(\alpha,\beta,\gamma,x)$. Heine took many results from Gauss’s 1813 
paper on hypergeometric series and extended them to the q-series φ. He listed the 
contiguous relations for φ, from which he derived continued fraction expansions for 
ratios of q-hypergeometric series, and he gave a very simple proof of the q-binomial 
theorem. The notation Ω(q,a) for the gamma function analog is also due to Heine, 
and as an analog of Gauss’s F(a,b,c,1) sum, he presented 


$$\phi(\alpha,\beta,\gamma,q,q^{\gamma-\alpha-\beta}) = \frac{\Omega(q,\gamma-1)\,\Omega(q,\gamma-\alpha-\beta-1)}{\Omega(q,\gamma-\alpha-1)\,\Omega(q,\gamma-\beta-1)}. \tag{27.3}$$

Heine applied the q-binomial theorem to obtain an important transformation now 
known as Heine’s transformation: 

$$\begin{aligned}
&1 + \frac{(1-q^{\alpha})(1-q^{\beta}x)}{(1-q)(1-q^{\gamma}x)}\,z + \frac{(1-q^{\alpha})(1-q^{\alpha+1})(1-q^{\beta}x)(1-q^{\beta+1}x)}{(1-q)(1-q^2)(1-q^{\gamma}x)(1-q^{\gamma+1}x)}\,z^2 + \cdots\\
&\quad= \frac{(1-q^{\beta}x)(1-q^{\beta+1}x)\cdots}{(1-q^{\gamma}x)(1-q^{\gamma+1}x)\cdots}\cdot\frac{(1-q^{\alpha}z)(1-q^{\alpha+1}z)\cdots}{(1-z)(1-qz)\cdots}\\
&\qquad\times\left(1 + \frac{(1-q^{\gamma-\beta})(1-z)}{(1-q)(1-q^{\alpha}z)}\,q^{\beta}x + \frac{(1-q^{\gamma-\beta})(1-q^{\gamma-\beta+1})(1-z)(1-qz)}{(1-q)(1-q^2)(1-q^{\alpha}z)(1-q^{\alpha+1}z)}\,q^{2\beta}x^2 + \cdots\right).
\end{aligned} \tag{27.4}$$

He also defined a q-difference operator and found the second-order difference 
equation of which the hypergeometric equation was a limiting case. However, he did 
not define a q-integral. 


2 Heine (1847). 

The origin of the transformation (27.4) remained a puzzle; to which result 
in hypergeometric series did this correspond? C. Johannes Thomae (1840-1921) 
answered this question. Thomae studied at the University of Halle and was inspired by 
Heine to devote himself to function theory. Thomae moved to Göttingen in 1862 with 
the intention of working under Riemann who soon fell seriously ill. Thomae stayed 
on and in 1864, he earned his doctoral degree under Schering, one of the editors of 
Gauss’s collected works. Thomae then returned to teach at Halle for a few years before 
moving to Freiburg and then to Jena. Thomae wrote his paper on Heine’s series³ in 
1869 while at Halle. In this paper he defined the q-integral and showed that Heine’s 
transformation was actually the q-extension of Euler’s integral representation of the 
hypergeometric series. Thomae defined the q-integral by 


$$\int_0^1 f(x)\,\Delta x = (1-q)\sum_{s=0}^{\infty}q^{s}f(q^{s}). \tag{27.5}$$


He explained that the integral was the inverse of the difference operator 

$$\Delta f(x) = f(qx) - f(x), \tag{27.6}$$

also noting that it was better to define the q-gamma, or more correctly, the q-Π 
function by 

$$\Pi(\alpha,q) = (1-q)^{-\alpha}\,\Omega(q,\alpha). \tag{27.7}$$

Indeed, with this definition, 

$$\Pi(\alpha,q) = \frac{1-q^{\alpha}}{1-q}\,\Pi(\alpha-1,q) \tag{27.8}$$

and $\Pi(\alpha,q) \to \Pi(\alpha) = \Gamma(\alpha+1)$ as $q \to 1^-$. He also observed that the q-binomial 
theorem was equivalent to 

$$\int_0^1 s^{\alpha}\,p(\beta,sq)\,\Delta s = \frac{\Pi(\alpha,q)\,\Pi(\beta,q)}{\Pi(\alpha+\beta+1,q)}, \tag{27.9}$$

where 

$$p(\beta,x) = \frac{(1-x)(1-xq)(1-xq^2)\cdots}{(1-xq^{\beta})(1-xq^{\beta+1})(1-xq^{\beta+2})\cdots}. \tag{27.10}$$

Moreover, he noted that as $q \to 1^-$, formula (27.9) reduced to Euler’s beta integral 
formula 

$$\int_0^1 s^{\alpha}(1-s)^{\beta}\,ds = \frac{\Pi(\alpha)\,\Pi(\beta)}{\Pi(\alpha+\beta+1)}.$$

About forty years later, the able amateur mathematician Frank Hilton Jackson 
(1870-1960) redefined the q-integral and the q-gamma function.⁴ He took the 
q-integral to be the inverse of the q-derivative 

$$\frac{\phi(qx) - \phi(x)}{qx - x} \tag{27.11}$$


3 Thomae (1869). 
4 Jackson (1910). 


so that the q-integral amounted to 

$$\int_0^a f(x)\,d_q x = \sum_{n=0}^{\infty}f(aq^n)\left(aq^n - aq^{n+1}\right). \tag{27.12}$$

The reader may observe that the expression on the right is the Riemann sum for the 
division points $a, aq, aq^2, \ldots$ on [0,a]. Note that Fermat integrated $x^{m/n}$, where m and 
n were integers, by first evaluating its q-integral and then letting $q \to 1^-$. Jackson’s 
notation for the q-gamma, $\Gamma_q(x)$, is still in use; he set 

$$\Gamma_q(x) = \Pi(x-1,q). \tag{27.13}$$
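The Riemann-sum reading of the q-integral, and Fermat's use of it for power functions, can be illustrated numerically. The sketch below truncates the series for $\int_0^a f(x)\,d_qx$ after a fixed number of terms (the parameter `terms` is an assumption of the sketch, not part of Jackson's definition) and compares the result for $f(x) = x^k$ with the closed form $a^{k+1}(1-q)/(1-q^{k+1})$ obtained by summing the geometric series.

```python
def jackson_integral(f, a, q, terms=2000):
    """Truncated sum of f(a q^n) (a q^n - a q^{n+1}) for n = 0, ..., terms - 1."""
    return sum(f(a * q ** n) * (a * q ** n - a * q ** (n + 1)) for n in range(terms))


def monomial_q_integral(a, k, q):
    """Closed form of the q-integral of x^k over [0, a]: a geometric series."""
    return a ** (k + 1) * (1 - q) / (1 - q ** (k + 1))


if __name__ == "__main__":
    # Fermat's case: exponent m/n = 1/2, with q approaching 1 from below.
    for q in (0.5, 0.9, 0.99, 0.999):
        approx = jackson_integral(lambda x: x ** 0.5, 1.0, q, terms=40000)
        print(q, approx, monomial_q_integral(1.0, 0.5, q))
    print("ordinary integral:", 2 / 3)
```

As q increases toward 1 the printed values approach 2/3, the ordinary integral of $x^{1/2}$ over [0,1], since $(1-q)/(1-q^{k+1}) \to 1/(k+1)$.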


Jackson was a British naval chaplain; he apparently had a little difficulty in publishing 
some of his earlier works because their significance was not quite clear to 
referees. Jackson’s lifelong program was to systematically develop the theory of 
q-hypergeometric (or basic hypergeometric) series by proving analogs of summation and 
transformation formulas for generalized hypergeometric series. Jackson conceived of 
q as the base of the series, analogous to the base of the logarithm. His terminology 
is now widely used. In recent years, Vyacheslav Spiridonov has worked out another 
important and productive generalization of hypergeometric functions, elliptic 
hypergeometric functions.⁵ A formal series, $\sum_n c_n$, is called an elliptic hypergeometric 
series if $c_{n+1}/c_n = h(n)$, where h(n) is some elliptic function of $n \in \mathbb{C}$. 

Leonard James Rogers (1862-1933) gave a new direction to q-series theory through 
his researches in the early 1890s. Rogers studied at Oxford where his father was 
a professor of political economy. As a boy, Rogers was tutored in mathematics by 
A. Griffith, an Oxford mathematician with a strong interest in elliptic functions. 
Rogers’s earliest work was in reciprocants, a topic in invariant theory. The second 
half of Sylvester’s 1886 Oxford lectures on this subject were devoted to the work 
of Rogers. Around this same time, Rogers’s interest turned to analysis and to the 
topic in which he did his most famous work, theta series and products and, more 
generally, q-series. In his Royal Society obituary notice of Rogers, A. L. Dixon
recalled attending an 1887 course of lectures at Oxford in which Rogers manipulated
q-series and products with great skill. In the period 1893-95, after a study of Heine’s
1878 Kugelfunctionen,⁶ Rogers published four important papers on q-extensions of
Hermite and ultraspherical polynomials. In his book, Heine included extra material, 
printed in smaller type, on basic hypergeometric series, clearly expecting that the 
q-extensions of some results in the book would be important and fruitful. Rogers 
showed that Heine was right. Rogers was initially struck by a lack of symmetry in 
Heine’s transformation formula. In order to present Rogers’s work succinctly and 
with transparency, we introduce some modern notation different from that of Rogers, 
though he also employed abbreviations. Let 


\[
(x)_n \equiv (x;q)_n = (1-x)(1-qx)(1-q^2x)\cdots(1-q^{n-1}x), \qquad n = 1,2,3,\ldots,\infty. \tag{27.14}
\]


5 Spiridonov (2013). 
6 Heine (1878).

We observe that Rogers wrote $x_n$ for $1 - q^{n-1}x$; $x_n!$ for $(x)_n$; and $(x)$ for $(x)_\infty$. We
now replace $q^{\alpha}$, $q^{\beta}$, $q^{\gamma}$ in Heine’s formula (27.4) by a, b, c and write it as

\[
\phi(a,bx,cx,q,z) = \frac{(bx)_\infty (az)_\infty}{(cx)_\infty (z)_\infty}\,\phi\!\left(\frac{c}{b},z,az,q,bx\right).
\]

Here

\[
\phi(a,b,c,q,x) = \sum_{n=0}^{\infty} \frac{(a)_n(b)_n}{(q)_n(c)_n}\,x^n. \tag{27.15}
\]

After reparametrization, we can write Heine’s transformation as

\[
\phi(a,b,c,q,x) = \frac{(b)_\infty (ax)_\infty}{(c)_\infty (x)_\infty}\,\phi\!\left(\frac{c}{b},x,ax,q,b\right). \tag{27.16}
\]


Perhaps Rogers’s earlier work on invariant theory had made him sensitive to
symmetry, so as a first step he wrote the transformation in symmetric form.⁷ He observed that,
by the symmetry in a and b and also by a reapplication of Heine’s transformation, he
obtained two results from (27.16):

\[
\phi(a,b,c,q,x) = \frac{(bx)_\infty (c/b)_\infty}{(x)_\infty (c)_\infty}\,\phi\!\left(\frac{abx}{c},b,bx,q,\frac{c}{b}\right), \tag{27.17}
\]

\[
\phi(a,b,c,q,x) = \frac{(abx/c)_\infty}{(x)_\infty}\,\phi\!\left(\frac{c}{a},\frac{c}{b},c,q,\frac{abx}{c}\right). \tag{27.18}
\]


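Rogers set $a = \mu e^{-i\theta}$, $b = \nu e^{-i\theta}$, $c = \mu\nu$, and $x = \lambda e^{i\theta}$, observing that the last
three formulas implied that

\[
\psi(\lambda,\mu,\nu,q,\theta) = (\lambda e^{i\theta})_\infty (\mu\nu)_\infty\,\phi(\mu e^{-i\theta},\nu e^{-i\theta},\mu\nu,q,\lambda e^{i\theta})
\]

was symmetric in $\lambda$, $\mu$, $\nu$ and in $\theta$ and $-\theta$. He then went on to define a q-extension
of the Hermite polynomials $A_n(\theta)$ by the relation

\[
\prod_{n=0}^{\infty} \frac{1}{1 - 2tq^n\cos\theta + t^2q^{2n}} = \sum_{n=0}^{\infty} \frac{A_n(\theta)}{(q)_n}\,t^n. \tag{27.19}
\]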


In a later paper,⁸ Rogers defined a q-extension of the ultraspherical or Gegenbauer
polynomials and denoted it by $L_n(\theta)$, using the equation

\[
\prod_{n=0}^{\infty} \frac{1 - 2\lambda tq^n\cos\theta + \lambda^2t^2q^{2n}}{1 - 2tq^n\cos\theta + t^2q^{2n}} = \sum_{n=0}^{\infty} \frac{L_n(\theta)}{(q)_n}\,t^n. \tag{27.20}
\]


7 Rogers (1893b). 
8 Rogers (1894).

In connection with the q-Hermite polynomials, Rogers raised the question: Suppose

\[
a_0 + a_1A_1(\theta) + a_2A_2(\theta) + \cdots = b_0 + 2b_1\cos\theta + 2b_2\cos 2\theta + \cdots.
\]

How are the coefficients $a_0, a_1, a_2, \ldots$ and $b_0, b_1, b_2, \ldots$ related to each other? He
solved the problem and then applied the solution to the function

\[
\prod_{n=1}^{\infty} \left(1 + 2q^{n-\frac12}\cos\theta + q^{2n-1}\right).
\]

From the triple product identity, the Fourier cosine expansion of this function was
already known; Rogers found the expansion in terms of $A_r(\theta)$, and in particular,
he got

\[
a_0 = \sum_{n=0}^{\infty} \frac{q^{n^2}}{(q)_n}, \qquad a_1 = \frac{q^{1/2}}{1-q}\sum_{n=0}^{\infty} \frac{q^{n(n+1)}}{(q)_n}.
\]


When he expressed $a_0$, $a_1$ in terms of the $b$s, he obtained a series convertible
to products by the triple product identity. The final results emerged as the
Rogers–Ramanujan identities. We give Ramanujan’s derivation of these formulas in Chapter
26. Although Rogers discovered these remarkable identities in 1894, they remained
unnoticed until Ramanujan rediscovered them without proof. Then, in 1917, quite by
chance, Ramanujan came across Rogers’s paper while browsing through old journals.
The ensuing correspondence between Ramanujan and Rogers led them both to new
proofs of the identities now known by their names. Somewhat surprisingly, around
1980, the physicist R. J. Baxter rediscovered the Rogers–Ramanujan identities in the
course of his work on the hard hexagon model.⁹

Rogers left unanswered the question of the orthogonality of his q-extensions of
the Hermite and ultraspherical polynomials. He had found three-term recurrence
relations for these polynomials, but it was not until the 1930s that these relations
were widely understood to imply that the polynomials were orthogonal with respect to
some positive weights. Consider the statement of the spectral theorem for orthogonal
polynomials: Suppose that a sequence of monic polynomials $\{P_n(x)\}$ with real
coefficients satisfies a three-term recurrence relation

\[
xP_n(x) = P_{n+1}(x) + \alpha_nP_n(x) + \beta_nP_{n-1}(x), \qquad n \ge 1, \tag{27.21}
\]

with $P_0(x) = 1$, $P_1(x) = x - \alpha_0$, $\alpha_{n-1}$ real and $\beta_n > 0$. Then there exists a distribution
function $\mu$, corresponding to a positive and finite Borel measure on the real line, such
that

\[
\int_{-\infty}^{\infty} P_m(x)P_n(x)\,d\mu(x) = \xi_n\delta_{mn}, \tag{27.22}
\]


9 Baxter (1980).

where

\[
\xi_n = \beta_1\beta_2\cdots\beta_n. \tag{27.23}
\]


The converse is also true and straightforward to prove. This theorem is sometimes 
known as Favard’s theorem since Jean Favard (1902-1965) published a proof in the 
Comptes Rendus (Paris) in 1935. However, the theorem appeared earlier in the works 
of O. Perron, M. Stone,¹⁰ and A. Wintner.¹¹ In fact, in 1934 J. Meixner applied this
theorem to his work on orthogonal polynomials,¹² with a reference to a result of
Perron.¹³ Stieltjes’s famous 1895 paper on continued fractions also contains a result
yielding the spectral theorem.¹⁴ It is unlikely that Rogers was aware of Stieltjes’s
work. We note that A. L. Dixon wrote of Rogers that he had only a vague notion of the 
work of other mathematicians. Probably an abstract result such as the spectral theorem 
for orthogonal polynomials would not have interested Rogers who loved special 
functions and formulas. He would have wanted to use the actual weight function, 
needed for computational purposes. 

The Hungarian mathematician Gábor Szegő (1895-1985) took the first step toward
finding an explicit weight function. Szegő studied in Hungary and Germany under
Fejér, Frobenius, and Hilbert. His most famous book, Problems and Theorems in
Analysis, was originally written in German in 1924, in collaboration with George
Pólya. Szegő made significant contributions to orthogonal polynomials, on which
he also wrote a very influential book first published in 1939. He founded the study
of orthogonality on the unit circle; for a probability measure $\alpha(\theta)$, he defined this
orthogonality by

\[
\int_0^{2\pi} \phi_n(e^{i\theta})\,\overline{\phi_m(e^{i\theta})}\,d\alpha(\theta) = 0, \qquad m \ne n,
\]

where

\[
\phi_n(z) = \sum_{k=0}^{n} a_{k,n}z^k.
\]


Szegő was the first to appreciate the general program Rogers had in mind,
as opposed to its specific though very important corollaries, such as the
Rogers–Ramanujan identities. Rogers’s papers inspired Szegő to discover the first nontrivial
example of orthogonal polynomials on the circle, where

\[
a_{k,n} = a\,(-1)^k\,\frac{(q)_n}{(q)_k(q)_{n-k}}\,q^{-k/2},
\]

with a normalizing factor $a$ depending only on n and q.


10 Stone (1932). 

11 Wintner (1929).

12 Meixner (1934). 

13 Perron (1929). 

14 Stieltjes (1993) vol. 2, pp. 628-630. 




Here note the connection with the Gaussian polynomial. Recall also that Gauss had
expressed $\phi(-q)$ and $\phi(\sqrt{q})$ as finite products and evaluated the general quadratic
Gauss sum from these expressions. Szegő found that the weight function $f(\theta)\,d\theta =
d\alpha(\theta)$ in this case was

\[
f(\theta) = \sum_{n=-\infty}^{\infty} q^{n^2/2}e^{in\theta},
\]


and he applied the triple product identity to prove it. The weight function for Rogers’s 
q-Hermite polynomial is also a theta function; the proof of orthogonality in that case 
also uses the triple product identity. 

The q-ultraspherical polynomials were independently rediscovered by Feldheim
and I. L. Lanzewizky in 1941. The Hungarian mathematician Ervin Feldheim (1912-1944)
studied in Paris, since he was not admitted to the university in Budapest. His
thesis was in probability theory but on his return to Hungary he contributed important
results to the classical theory of orthogonal polynomials. One of these results was
contained in a letter to Fejér written by Feldheim shortly before his tragic death at
the hands of the Nazis. The letter was later found by Paul Turán, who described the
incident,¹⁵ “Thus the letter had been resting among Fejér’s letters for some 15 years.
... On the next day I received a letter from Szegő, in which he raised just a problem
solved in Feldheim’s letter! I sent this letter to Szegő and he published it with
applications.”

The origin of Feldheim and Lanzewizky’s papers was the work of Fejér on 
a generalization of Legendre polynomials; this generalization also included the 
ultraspherical polynomials. Feldheim and Lanzewizky wished to determine those 
generalized Legendre polynomials that were also orthogonal. They used the spectral 
theorem and found conditions under which the generalized Legendre polynomials 
satisfied the appropriate three-term recurrence relation. At the end of his paper, 
Feldheim raised the problem of determining the weight or distribution function for 
orthogonality but was unable to resolve the question. Earlier works of Stieltjes and 
Markov could have helped him here, and from a remark in his paper, it seems that 
he may have been aware of their work. We should remember, however, that Feldheim 
was working under extremely difficult circumstances during the war. So, like Rogers 
himself, Feldheim and Lanzewizky did not give the relevant orthogonality relations
and, in fact, they did not write the polynomials as q-extensions of the ultraspherical
polynomials. Around 1980, Richard Askey and Mourad Ismail finally established the
explicit orthogonality relation. 

In his paper containing the Rogers–Ramanujan identities, Rogers also observed that
the series

\[
\chi(\lambda^2) = 1 + \frac{\lambda^2q^2}{1-q} + \frac{\lambda^4q^6}{(1-q)(1-q^2)} + \cdots \tag{27.24}
\]

15 Turán (1990) vol. 3, p. 2626.

satisfied the relation

\[
\chi(\lambda^2) - \chi(\lambda^2q) = \lambda^2q^2\,\chi(\lambda^2q^2). \tag{27.25}
\]

This can be easily verified and implies the continued fraction expansion

\[
\frac{\chi(\lambda^2)}{\chi(\lambda^2q)} = 1 + \frac{\lambda^2q^2}{1+}\,\frac{\lambda^2q^3}{1+}\,\frac{\lambda^2q^4}{1+}\cdots. \tag{27.26}
\]
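The relation (27.25) and the continued fraction (27.26) lend themselves to a quick numerical check. The sketch below is illustrative only: the function names, the truncation depths and the sample values are assumptions, and the variable `t` stands for $\lambda^2$.

```python
def chi(t, q, terms=80):
    """chi(t) = sum over s of t^s q^(s(s+1)) / (q)_s, with t standing for lambda^2."""
    total, qfact = 0.0, 1.0
    for s in range(terms):
        total += t ** s * q ** (s * (s + 1)) / qfact
        qfact *= 1.0 - q ** (s + 1)
    return total


def continued_fraction(t, q, depth=80):
    """1 + t q^2/(1 + t q^3/(1 + t q^4/(1 + ...))), evaluated from the bottom up."""
    value = 1.0
    for k in range(depth + 1, 1, -1):  # k runs over depth+1, depth, ..., 2
        value = 1.0 + t * q ** k / value
    return value


if __name__ == "__main__":
    t, q = 0.7, 0.5  # hypothetical sample values with |q| < 1
    print(chi(t, q) - chi(t * q, q), t * q * q * chi(t * q * q, q))
    print(chi(t, q) / chi(t * q, q), continued_fraction(t, q))
```

Each printed pair should agree to within rounding error.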


This is now known as the Rogers–Ramanujan continued fraction because Ramanujan
rediscovered and went much farther with it. In his first letter to Hardy, of January 16,
1913, Ramanujan stated without proof that

\[
\frac{1}{1+}\,\frac{e^{-2\pi}}{1+}\,\frac{e^{-4\pi}}{1+}\,\frac{e^{-6\pi}}{1+}\cdots = \left(\sqrt{\frac{5+\sqrt5}{2}} - \frac{\sqrt5+1}{2}\right)e^{2\pi/5} \tag{27.27}
\]

and that

\[
\frac{1}{1+}\,\frac{e^{-\pi\sqrt n}}{1+}\,\frac{e^{-2\pi\sqrt n}}{1+}\,\frac{e^{-3\pi\sqrt n}}{1+}\cdots
\]

could be exactly determined if n were any positive rational quantity. In his next letter
of February 27, 1913, Ramanujan wrote


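\[
\left(\frac{\sqrt5+1}{2} + \frac{e^{-2\alpha/5}}{1+}\,\frac{e^{-2\alpha}}{1+}\,\frac{e^{-4\alpha}}{1+}\cdots\right)\left(\frac{\sqrt5+1}{2} + \frac{e^{-2\beta/5}}{1+}\,\frac{e^{-2\beta}}{1+}\,\frac{e^{-4\beta}}{1+}\cdots\right) = \frac{5+\sqrt5}{2}
\]

with the condition $\alpha\beta = \pi^2$, .... This theorem is a particular case of a theorem on the continued
fraction

\[
\frac{1}{1+}\,\frac{ax}{1+}\,\frac{ax^2}{1+}\,\frac{ax^3}{1+}\cdots
\]

which is a particular case of the continued fraction

\[
\frac{1}{1+}\,\frac{ax}{1+bx+}\,\frac{ax^2}{1+bx^2+}\,\frac{ax^3}{1+bx^3+}\cdots
\]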


which is a particular case of a general theorem on continued fractions. 


Hardy was very impressed with Ramanujan’s results on continued fractions.
Concerning formulas (27.27) and (27.28) he wrote,¹⁶ “I had never seen anything in the least
like them before. A single look at them is enough to show that they could only be
written down by a mathematician of the highest class. They must be true because, if
they were not true, no one would have had the imagination to invent them.”

16 Hardy (1978) p. 9.
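The evaluation at $e^{-2\pi}$ that Ramanujan sent in his first letter, namely that $1/(1 + e^{-2\pi}/(1 + e^{-4\pi}/(1 + \cdots)))$ equals $\bigl(\sqrt{(5+\sqrt5)/2} - (\sqrt5+1)/2\bigr)e^{2\pi/5}$, can be tested in floating point. This is only a numerical sanity check under the assumption that truncating the fraction at depth 60 is harmless; the function name is hypothetical.

```python
import math


def rogers_ramanujan_cf(q, depth=60):
    """1/(1 + q/(1 + q^2/(1 + q^3/(1 + ...)))), evaluated from the bottom up."""
    value = 1.0
    for k in range(depth, 0, -1):
        value = 1.0 + q ** k / value
    return 1.0 / value


def ramanujan_closed_form():
    """Ramanujan's stated value of the fraction at q = exp(-2 pi)."""
    golden = (math.sqrt(5.0) + 1.0) / 2.0
    return (math.sqrt((5.0 + math.sqrt(5.0)) / 2.0) - golden) * math.exp(2.0 * math.pi / 5.0)


if __name__ == "__main__":
    print(rogers_ramanujan_cf(math.exp(-2.0 * math.pi)), ramanujan_closed_form())
```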


27.2 Heine’s Transformation 


Heine proved his transformation for the q-hypergeometric series by a judicious
application of the q-binomial theorem.¹⁷ He proved the q-binomial theorem by the use
of contiguous relations; as we mentioned in Exercise 4.13, Gauss may have employed
this method to prove the binomial theorem. Heine required the contiguous relation

\[
\phi(\alpha+1,\beta,\gamma,q,x) - \phi(\alpha,\beta,\gamma,q,x) = q^{\alpha}x\,\frac{1-q^{\beta}}{1-q^{\gamma}}\,\phi(\alpha+1,\beta+1,\gamma+1,q,x) \tag{27.29}
\]

and the q-difference relation

\[
\phi(\alpha,\beta,\gamma,q,x) - \phi(\alpha,\beta,\gamma,q,qx) = x\,\frac{(1-q^{\alpha})(1-q^{\beta})}{1-q^{\gamma}}\,\phi(\alpha+1,\beta+1,\gamma+1,q,x). \tag{27.30}
\]
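Note that the second relation is the analog of the derivative equation

\[
\frac{d}{dx}F(a,b,c,x) = \frac{ab}{c}\,F(a+1,b+1,c+1,x).
\]

Heine then supposed that $\beta = \gamma = 1$ so that he had the q-binomial series

\[
\phi(\alpha,x) = 1 + \frac{1-q^{\alpha}}{1-q}\,x + \frac{(1-q^{\alpha})(1-q^{\alpha+1})}{(1-q)(1-q^2)}\,x^2 + \cdots. \tag{27.31}
\]

By (27.29),

\[
\phi(\alpha+1,x) = \frac{1}{1-q^{\alpha}x}\,\phi(\alpha,x). \tag{27.32}
\]

Combined with (27.30), this produced

\[
\phi(\alpha,x) = \frac{1-q^{\alpha}x}{1-x}\,\phi(\alpha,qx) = \frac{(1-q^{\alpha}x)(1-q^{\alpha+1}x)\cdots(1-q^{\alpha+n}x)}{(1-x)(1-qx)\cdots(1-q^nx)}\,\phi(\alpha,q^{n+1}x). \tag{27.33}
\]

The q-binomial theorem followed, since it was assumed that $|q| < 1$ and

\[
\phi(\alpha,q^{n+1}x) \to \phi(\alpha,0) = 1 \quad\text{as } n \to \infty.
\]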
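Letting n tend to infinity in the last display gives the q-binomial theorem in product form, $\phi(\alpha,x) = (q^{\alpha}x)_\infty/(x)_\infty$. A minimal numerical sketch of that statement follows; the helper names, the truncation at 400 terms and the sample values are assumptions made for illustration, and `a` stands for $q^{\alpha}$.

```python
def qpoch(x, q, n):
    """Finite q-shifted factorial (x; q)_n = (1 - x)(1 - qx)...(1 - q^(n-1) x)."""
    prod = 1.0
    for k in range(n):
        prod *= 1.0 - x * q ** k
    return prod


def q_binomial_series(a, x, q, terms=400):
    """Partial sum of the q-binomial series: sum of (a; q)_n / (q; q)_n * x^n."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (1.0 - a * q ** n) / (1.0 - q ** (n + 1)) * x
    return total


def q_binomial_product(a, x, q, factors=400):
    """Truncated product (a x; q)_infinity / (x; q)_infinity."""
    return qpoch(a * x, q, factors) / qpoch(x, q, factors)


if __name__ == "__main__":
    a, x, q = 0.3, 0.4, 0.5  # hypothetical sample values with |x| < 1, |q| < 1
    print(q_binomial_series(a, x, q), q_binomial_product(a, x, q))
```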


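17 Heine (1847).

Then, to obtain the transformation formula (27.4), Heine started with the series

\[
S = 1 + \frac{(1-q^{\alpha})(1-q^{\beta}x)}{(1-q)(1-q^{\gamma}x)}\,z + \frac{(1-q^{\alpha})(1-q^{\alpha+1})(1-q^{\beta}x)(1-q^{\beta+1}x)}{(1-q)(1-q^2)(1-q^{\gamma}x)(1-q^{\gamma+1}x)}\,z^2 + \cdots, \tag{27.34}
\]

where he assumed $|q| < 1$, $|x| < 1$, and $|z| < 1$ for convergence. He multiplied both
sides by $\phi(\gamma-\beta,q^{\beta}x)$ and used the product expression for this function, from the
q-binomial theorem, to get

\[
S\,\phi(\gamma-\beta,q^{\beta}x) = \phi(\gamma-\beta,q^{\beta}x) + \frac{1-q^{\alpha}}{1-q}\,z\,\phi(\gamma-\beta,q^{\beta+1}x) + \frac{(1-q^{\alpha})(1-q^{\alpha+1})}{(1-q)(1-q^2)}\,z^2\,\phi(\gamma-\beta,q^{\beta+2}x) + \cdots.
\]

In the next step, Heine expanded each of the $\phi$ on the right as series. We here employ
the abbreviated notation given in (27.14).

\[
S\,\phi(\gamma-\beta,q^{\beta}x) = \sum_{n=0}^{\infty} \frac{(q^{\alpha};q)_n}{(q;q)_n}\,z^n \sum_{m=0}^{\infty} \frac{(q^{\gamma-\beta};q)_m}{(q;q)_m}\,(q^{\beta+n}x)^m. \tag{27.35}
\]

He changed the order of summation and used the q-binomial theorem to obtain

\[
\begin{aligned}
&\phi(\alpha,z) + \frac{(q^{\gamma-\beta};q)_1}{(q;q)_1}\,q^{\beta}x\,\phi(\alpha,qz) + \frac{(q^{\gamma-\beta};q)_2}{(q;q)_2}\,q^{2\beta}x^2\,\phi(\alpha,q^2z) + \cdots \\
&\quad= \phi(\alpha,z)\left(1 + \frac{(q^{\gamma-\beta};q)_1(z;q)_1}{(q;q)_1(q^{\alpha}z;q)_1}\,q^{\beta}x + \frac{(q^{\gamma-\beta};q)_2(z;q)_2}{(q;q)_2(q^{\alpha}z;q)_2}\,q^{2\beta}x^2 + \cdots\right).
\end{aligned}
\]

Finally, Heine substituted this expression into the right-hand side of (27.35) and
replaced $\phi(\alpha,z)$ and $\phi(\gamma-\beta,q^{\beta}x)$ by their product expressions, to arrive at
the transformation (27.4). Heine found the q-extension of Gauss’s summation of
$F(\alpha,\beta,\gamma,1)$ by taking $x = 1$, $z = q^{\gamma-\alpha-\beta}$ in his transformation:

\[
\phi(\alpha,\beta,\gamma,q,q^{\gamma-\alpha-\beta}) = \frac{(q^{\beta})_\infty (q^{\gamma-\beta})_\infty}{(q^{\gamma})_\infty (q^{\gamma-\alpha-\beta})_\infty}\,\phi(\gamma-\alpha-\beta,q^{\beta}) = \frac{(q^{\gamma-\alpha})_\infty (q^{\gamma-\beta})_\infty}{(q^{\gamma})_\infty (q^{\gamma-\alpha-\beta})_\infty}. \tag{27.36}
\]

Note that this derivation is analogous to the method used to derive Gauss’s formula
from Euler’s integral for the hypergeometric function. This suggests that Heine’s
transformation is the q-analog of Euler’s integral formula; indeed, recall that Thomae
proved this after defining the q-integral.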
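Heine's transformation and the q-Gauss summation derived from it can both be spot-checked numerically. The sketch below writes them with base parameters a, b, c in place of $q^{\alpha}$, $q^{\beta}$, $q^{\gamma}$; the function names, truncation lengths and sample values are assumptions chosen for illustration.

```python
def qpoch_inf(x, q, factors=300):
    """Truncated infinite product (x; q)_infinity."""
    prod = 1.0
    for k in range(factors):
        prod *= 1.0 - x * q ** k
    return prod


def phi(a, b, c, q, x, terms=400):
    """Partial sum of Heine's series: sum of (a)_n (b)_n / ((q)_n (c)_n) * x^n."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (1.0 - a * q ** n) * (1.0 - b * q ** n) * x
        term /= (1.0 - q ** (n + 1)) * (1.0 - c * q ** n)
    return total


def heine_rhs(a, b, c, q, x):
    """Right-hand side of Heine's transformation for phi(a, b, c, q, x)."""
    pref = qpoch_inf(b, q) * qpoch_inf(a * x, q) / (qpoch_inf(c, q) * qpoch_inf(x, q))
    return pref * phi(c / b, x, a * x, q, b)


def q_gauss_rhs(a, b, c, q):
    """Product side of the q-Gauss sum, evaluated at x = c / (a b)."""
    return (qpoch_inf(c / a, q) * qpoch_inf(c / b, q)
            / (qpoch_inf(c, q) * qpoch_inf(c / (a * b), q)))


if __name__ == "__main__":
    print(phi(0.3, 0.5, 0.2, 0.4, 0.6), heine_rhs(0.3, 0.5, 0.2, 0.4, 0.6))
    print(phi(0.9, 0.8, 0.5, 0.4, 0.5 / 0.72), q_gauss_rhs(0.9, 0.8, 0.5, 0.4))
```

The second line takes a = 0.9, b = 0.8, c = 0.5, so the argument c/(ab) is less than 1 and the series converges.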


27.3 Rogers: Threefold Symmetry 


Rogers applied his knowledge and experience of elliptic functions and invariant
theory to develop the theory of q-Hermite and q-ultraspherical polynomials. From
invariant theory, he brought a sense of symmetry and expertise in applying infinite
series/products of operators. We may recall that in the first half of the nineteenth
century, Arbogast, Français, Murphy, D. Gregory, and Boole made extensive use of
operational calculus. Cayley and Sylvester appropriated these methods for invariant
theory. Then, in the 1880s, J. Hammond and P. MacMahon applied these techniques
to combinatorial and invariant theoretic problems. It is interesting to observe Rogers’s
use of algebra, combinatorics and analysis as he conceived of and solved new
problems in analysis. In his second paper of 1893, Rogers converted Heine’s
transformation into an equation with threefold symmetry.¹⁸ He showed that the function

\[
\psi(\lambda,\mu,\nu,q,\theta) = \phi(\mu e^{-i\theta},\nu e^{-i\theta},\mu\nu,q,\lambda e^{i\theta})\,(\lambda e^{i\theta})_\infty (\mu\nu)_\infty \tag{27.37}
\]

was symmetric in $\lambda$, $\mu$, and $\nu$, and also symmetric in $\theta$ and $-\theta$. He then set

\[
\chi(\lambda,\mu,\nu,q,\theta) = \psi(\lambda,\mu,\nu,q,\theta)\,P(\lambda)P(\mu)P(\nu), \tag{27.38}
\]

where

\[
P(\lambda) = \prod_{n=0}^{\infty} (1 - 2\lambda q^n\cos\theta + \lambda^2q^{2n})^{-1} = \frac{1}{(\lambda e^{i\theta})_\infty (\lambda e^{-i\theta})_\infty}. \tag{27.39}
\]

Also in his 1847 paper, Heine discussed such products; for particular values of $\lambda$
they are ubiquitous in elliptic function theory. Rogers defined a q-extension of the
Hermite polynomials $A_r(\theta)$ as the coefficient of $\lambda^r/(q)_r$ in the series expansion of $P(\lambda)$:

\[
P(\lambda) = \sum_{r=0}^{\infty} \frac{A_r(\theta)}{(q)_r}\,\lambda^r. \tag{27.40}
\]

From Euler’s expansion, (25.20), Rogers had

\[
P(\lambda) = \sum_{m=0}^{\infty} \frac{\lambda^me^{im\theta}}{(q)_m}\,\sum_{n=0}^{\infty} \frac{\lambda^ne^{-in\theta}}{(q)_n}. \tag{27.41}
\]

18 Rogers (1893b).

Hence Rogers obtained his expression for the q-Hermite polynomial:


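\[
A_r(\theta) = \sum_{n=0}^{r} \frac{(q)_r}{(q)_n(q)_{r-n}}\,e^{i(r-2n)\theta} = \sum_{n=0}^{r} \frac{(q)_r}{(q)_n(q)_{r-n}}\,\cos(r-2n)\theta. \tag{27.42}
\]

Note that the last step followed because $A_r(\theta)$ was an even function of $\theta$. In his
1894 paper, Rogers noted the three-term recurrence relation for $A_r(\theta)$:

\[
2\cos\theta\,A_{r-1}(\theta) = A_r(\theta) + (1-q^{r-1})A_{r-2}(\theta), \tag{27.43}
\]

obtained as a consequence of the relation

\[
P(\lambda q) = (1 - 2\lambda\cos\theta + \lambda^2)\,P(\lambda), \tag{27.44}
\]

or

\[
\sum_{r=0}^{\infty} \frac{A_r(\theta)}{(q)_r}\,\lambda^rq^r = (1 - 2\lambda\cos\theta + \lambda^2)\sum_{r=0}^{\infty} \frac{A_r(\theta)}{(q)_r}\,\lambda^r. \tag{27.45}
\]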
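The cosine sum for the q-Hermite polynomials, the three-term recurrence and the generating product can be compared numerically. In this sketch the function names, truncation lengths and sample values are illustrative assumptions; `A` implements the cosine sum, `P` a truncated product of the factors $1/(1 - 2\lambda q^n\cos\theta + \lambda^2q^{2n})$, and `series` the corresponding partial sum in powers of $\lambda$.

```python
import math


def qfact(q, n):
    """(q; q)_n = (1 - q)(1 - q^2)...(1 - q^n)."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= 1.0 - q ** k
    return prod


def A(r, theta, q):
    """q-Hermite polynomial A_r(theta) from the cosine sum."""
    return sum(qfact(q, r) / (qfact(q, n) * qfact(q, r - n)) * math.cos((r - 2 * n) * theta)
               for n in range(r + 1))


def P(lam, theta, q, factors=200):
    """Truncated product of 1/(1 - 2 lam q^n cos(theta) + lam^2 q^(2n))."""
    prod = 1.0
    for n in range(factors):
        prod /= 1.0 - 2.0 * lam * q ** n * math.cos(theta) + lam ** 2 * q ** (2 * n)
    return prod


def series(lam, theta, q, terms=60):
    """Partial sum of A_r(theta) lam^r / (q)_r."""
    return sum(A(r, theta, q) * lam ** r / qfact(q, r) for r in range(terms))


if __name__ == "__main__":
    theta, q, lam = 0.7, 0.5, 0.3  # hypothetical sample values
    print(P(lam, theta, q), series(lam, theta, q))
    for r in range(2, 6):
        print(r, 2 * math.cos(theta) * A(r - 1, theta, q),
              A(r, theta, q) + (1 - q ** (r - 1)) * A(r - 2, theta, q))
```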


In his 1893 paper, Rogers raised the problem of expanding $\chi(\lambda,\mu,\nu,q,\theta)$ in (27.38)
as a series in q-Hermite polynomials. The series would take the form $A_0H_0 + A_1H_1 +
A_2H_2 + \cdots$ and the problem was to determine $H_r$. He showed that $H_r$, a homogeneous
symmetric function of degree r in $\lambda$, $\mu$, and $\nu$, was the coefficient of $k^r$ in the series
expansion of

\[
\frac{1}{(k\lambda)_\infty (k\mu)_\infty (k\nu)_\infty}. \tag{27.46}
\]

Rogers gave an interesting proof of this expansion and we sketch it very briefly. He
observed that for the function $\chi$ defined by (27.38),

\[
\delta_\lambda\chi = \delta_\mu\chi = \delta_\nu\chi, \tag{27.47}
\]

where $\delta$ was the difference operator defined by

\[
\delta_xf(x) = \frac{f(x) - f(qx)}{x}. \tag{27.48}
\]

From (27.47), he was able to deduce that

\[
\delta_\lambda H_r(\lambda,\mu,\nu) = \delta_\mu H_r(\lambda,\mu,\nu) = \delta_\nu H_r(\lambda,\mu,\nu). \tag{27.49}
\]

He denoted the coefficient of $\lambda^{\alpha}\mu^{\beta}\nu^{\gamma}$ in $H_r$ by $a_{\alpha,\beta,\gamma}$. Note that $\alpha + \beta + \gamma = r$.
Rogers next showed that (27.49) implied the recurrence relations

\[
(1-q^{\alpha+1})\,a_{\alpha+1,\beta,\gamma} = (1-q^{\beta+1})\,a_{\alpha,\beta+1,\gamma} = (1-q^{\gamma+1})\,a_{\alpha,\beta,\gamma+1}. \tag{27.50}
\]

Combined with the initial condition $a_{r,0,0} = 1/(q)_r$, the relations (27.50) uniquely
defined the coefficients. At this point, Rogers remarked that because of uniqueness
it was sufficient to produce a set of coefficients satisfying these conditions. He then
quickly demonstrated that the coefficient of $k^r$ in the series expansion of (27.46) was
a homogeneous symmetric function of degree r in $\lambda$, $\mu$, and $\nu$ whose coefficients
satisfied the same initial condition and recurrence relations (27.50). This proved his
result, though he gave no indication of how he arrived at (27.46).

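Rogers noted some interesting and important particular cases of his theorem. When
$\lambda = 0$, he had

\[
(\mu\nu)_\infty\,P(\mu)P(\nu) = 1 + A_1(\theta)H_1(\mu,\nu) + A_2(\theta)H_2(\mu,\nu) + \cdots, \tag{27.51}
\]

where $H_r(\mu,\nu)$ was the coefficient of $k^r$ in

\[
\frac{1}{(k\mu)_\infty (k\nu)_\infty} = \left(1 + \frac{k\mu}{1-q} + \frac{k^2\mu^2}{(1-q)(1-q^2)} + \cdots\right)\left(1 + \frac{k\nu}{1-q} + \frac{k^2\nu^2}{(1-q)(1-q^2)} + \cdots\right). \tag{27.52}
\]

So

\[
(q)_rH_r(\mu,\nu) = \mu^r + \frac{(q)_r}{(q)_1(q)_{r-1}}\,\mu^{r-1}\nu + \frac{(q)_r}{(q)_2(q)_{r-2}}\,\mu^{r-2}\nu^2 + \cdots + \nu^r. \tag{27.53}
\]

We note that when $\mu = x$ and $\nu = 1$,

\[
H_r(x) = (q)_rH_r(x,1) = x^r + \frac{(q)_r}{(q)_1(q)_{r-1}}\,x^{r-1} + \frac{(q)_r}{(q)_2(q)_{r-2}}\,x^{r-2} + \cdots + 1. \tag{27.54}
\]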


The polynomials $H_r(x)$ (27.54) are now called Rogers–Szegő polynomials,
because Szegő proved the orthogonality of $H_r(-x/\sqrt{q})$ with respect to a suitable
measure on the unit circle. When $\mu = xe^{i\phi}$ and $\nu = xe^{-i\phi}$ in (27.51), Rogers got

\[
\frac{(x^2)_\infty}{\prod_{n=0}^{\infty} (1 - 2xq^n\cos(\theta+\phi) + x^2q^{2n})(1 - 2xq^n\cos(\theta-\phi) + x^2q^{2n})} = \sum_{n=0}^{\infty} \frac{A_n(\theta)A_n(\phi)}{(q)_n}\,x^n. \tag{27.55}
\]


The result (27.55) is now known as the q-Mehler formula. Rogers also applied (27.51)
to prove the useful linearization formula for q-Hermite polynomials:

\[
\frac{A_m(\theta)A_n(\theta)}{(q)_m(q)_n} = \sum_{k=0}^{\min(m,n)} \frac{A_{m+n-2k}(\theta)}{(q)_k(q)_{m-k}(q)_{n-k}}. \tag{27.56}
\]

Rogers also noted that, by definition, $A_r(\pi/2)$ was the coefficient of $k^r/(q)_r$ in
$\prod_{n=0}^{\infty} (1 + k^2q^{2n})^{-1}$, and hence

\[
A_r\!\left(\frac{\pi}{2}\right) =
\begin{cases}
(-1)^{r/2}(1-q)(1-q^3)\cdots(1-q^{r-1}), & r \text{ even},\\
0, & r \text{ odd}.
\end{cases}
\]

Recall that Gauss evaluated Gauss sums from this result; see the formula (25.36).
Thus, though Rogers may not have known it, this result is due to Gauss.


27.4 Rogers: Rogers-Ramanujan Identities 


In order to derive the Rogers-Ramanujan identities, Rogers expanded the infinite
product $1/P(\lambda)$ as a Fourier series and as a series in the q-Hermite polynomials. He then
found a relation between the coefficients in these two series, yielding the famous
identities. In his 1894 paper,¹⁹ Rogers raised and solved the problem: Suppose a function
$f(\theta)$ is expanded as a Fourier cosine series and as a series in q-Hermite polynomials $A_r(\theta)$:

\[
f(\theta) = a_0 + a_1A_1(\theta) + a_2A_2(\theta) + \cdots = b_0 + 2b_1\cos\theta + 2b_2\cos 2\theta + \cdots. \tag{27.57}
\]

Express the coefficient $a_n$ in terms of a series of the $b_j$ and, conversely, $b_n$ in terms of
a series of the $a_j$. Rogers found that

\[
b_n = a_n + \frac{1-q^{n+2}}{1-q}\,a_{n+2} + \frac{(1-q^{n+3})(1-q^{n+4})}{(1-q)(1-q^2)}\,a_{n+4} + \frac{(1-q^{n+4})(1-q^{n+5})(1-q^{n+6})}{(1-q)(1-q^2)(1-q^3)}\,a_{n+6} + \cdots. \tag{27.58}
\]

For the converse, he gave $a_0$, $a_1$ in terms of series of $b$, but for the general case he
merely described the method by which the $a_n$ could be obtained for higher values of
n. He had

\[
a_0 = b_0 - (1+q)b_2 + q(1+q^2)b_4 - \cdots + (-1)^rq^{r(r-1)/2}(1+q^r)\,b_{2r} + \cdots, \tag{27.59}
\]

\[
(1-q)a_1 = (1-q)b_1 - (1-q^3)b_3 + q(1-q^5)b_5 - \cdots + (-1)^rq^{r(r-1)/2}(1-q^{2r+1})\,b_{2r+1} + \cdots. \tag{27.60}
\]

The derivation of (27.58) was simple. Rogers substituted the expression (27.42) for
$A_n(\theta)$ in terms of $\cos k\theta$ on the left side of (27.57) and then equated the coefficients
of $\cos n\theta$ on both sides. He noted the formula only for even n, but the method also
yields the case for odd n. Rogers’s method for finding (27.59) and (27.60) was quite
elaborate and he did not give the general formula. In his third paper of 1893, on the
expansion of infinite products, Rogers supposed

\[
f(\theta) = C_0 + C_1A_1(\theta) + C_2A_2(\theta) + \cdots
\]

to be given. He then asked how to find $K_0, K_1, K_2, \ldots$ in the expansion

\[
\frac{f(\theta)}{\prod_{n=0}^{\infty} (1 - 2\lambda q^n\cos\theta + \lambda^2q^{2n})} = K_0 + K_1A_1(\theta) + K_2A_2(\theta) + \cdots. \tag{27.61}
\]

19 Rogers (1894).


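Rogers expressed the result symbolically, in terms of the difference operator

\[
\delta_\lambda G(\lambda) = \frac{G(\lambda) - G(\lambda q)}{\lambda},
\]

as

\[
K_0 + K_1x + K_2x^2 + \cdots = \frac{1}{(x\lambda)_\infty (x\delta_\lambda)_\infty}\,(C_0 + C_1\lambda + C_2\lambda^2 + \cdots). \tag{27.62}
\]

Note that he used an infinite product in the operator $\delta_\lambda$.
Also in his 1894 paper, Rogers applied (27.62) to find the q-Hermite expansion of

\[
\frac{1}{P(\lambda)} = \prod_{n=0}^{\infty} (1 - 2\lambda q^n\cos\theta + \lambda^2q^{2n}). \tag{27.63}
\]

He noted that for an analytic function $\phi$

\[
\frac{1}{(x\delta_\lambda)_\infty}\,\phi(\lambda) = \frac{1}{(\lambda\delta_x)_\infty}\,\phi(x). \tag{27.64}
\]

He verified this by taking the special value $\phi(\lambda) = \lambda^m$. Since

\[
\frac{1}{(x\delta_\lambda)_\infty} = \sum_{n=0}^{\infty} \frac{x^n\delta_\lambda^n}{(q)_n},
\]

he had

\[
\frac{1}{(x\delta_\lambda)_\infty}\,\lambda^m = \sum_{n=0}^{m} \frac{(q)_m}{(q)_n(q)_{m-n}}\,x^n\lambda^{m-n}, \tag{27.65}
\]

an expression symmetric in x and $\lambda$. Therefore, (27.62) could be rewritten as

\[
K_0 + K_1x + K_2x^2 + \cdots = \frac{1}{(\lambda x)_\infty (\lambda\delta_x)_\infty}\,(C_0 + C_1x + C_2x^2 + \cdots),
\]

or

\[
C_0 + C_1x + C_2x^2 + \cdots = (\lambda\delta_x)_\infty (\lambda x)_\infty\,(K_0 + K_1x + \cdots). \tag{27.66}
\]

Rogers then argued that if $f(\theta) = 1/P(\lambda) = C_0 + C_1A_1(\theta) + \cdots$, then
$K_0 + K_1A_1(\theta) + \cdots = 1$;
substituting in (27.66) gave him

\[
C_0 + C_1x + C_2x^2 + \cdots = (\lambda\delta_x)_\infty (\lambda x)_\infty
= \left(1 - \frac{\lambda\delta_x}{1-q} + \frac{q\lambda^2\delta_x^2}{(1-q)(1-q^2)} - \cdots\right)\left(1 - \frac{\lambda x}{1-q} + \frac{q\lambda^2x^2}{(1-q)(1-q^2)} - \cdots\right). \tag{27.67}
\]

The last step made use of Euler’s special case of the q-binomial theorem, (25.19).
Rogers replaced $\lambda$ by x in (27.65); he then applied the first expression in parentheses
to the second expression in parentheses, in the right-hand side of equation (27.67), to
conclude that the coefficient $C_r$ of $x^r$ was

\[
C_r = (-1)^r\,\frac{q^{r(r-1)/2}\lambda^r}{(q)_r}\sum_{s=0}^{\infty} \frac{\lambda^{2s}q^{rs+s(s-1)}}{(q)_s}.
\]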


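This gave the q-Hermite expansion of $1/P(\lambda)$. Changing $\lambda$ to $-\lambda q$ yielded

\[
\begin{aligned}
&(1 + 2\lambda q\cos\theta + \lambda^2q^2)(1 + 2\lambda q^2\cos\theta + \lambda^2q^4)\cdots \\
&\quad= \chi(\lambda^2) + \frac{\lambda q}{1-q}\,\chi(\lambda^2q)A_1(\theta) + \frac{\lambda^2q^3}{(1-q)(1-q^2)}\,\chi(\lambda^2q^2)A_2(\theta) + \frac{\lambda^3q^6}{(1-q)(1-q^2)(1-q^3)}\,\chi(\lambda^2q^3)A_3(\theta) + \cdots,
\end{aligned}
\tag{27.68}
\]

where

\[
\chi(\lambda^2) = 1 + \frac{\lambda^2q^2}{1-q} + \frac{\lambda^4q^6}{(1-q)(1-q^2)} + \frac{\lambda^6q^{12}}{(1-q)(1-q^2)(1-q^3)} + \cdots. \tag{27.69}
\]

From the triple product identity, Rogers knew the Fourier cosine expansion of the
product in (27.68) when $\lambda = q^{-1/2}$:

\[
(1 + 2q^{1/2}\cos\theta + q)(1 + 2q^{3/2}\cos\theta + q^3)\cdots
= \frac{1}{(q)_\infty}\left(1 + 2q^{1/2}\cos\theta + 2q^2\cos 2\theta + 2q^{9/2}\cos 3\theta + \cdots\right).
\]

He therefore set $\lambda = q^{-1/2}$ in (27.68) and applied (27.59) and (27.60). Thus, the
Rogers–Ramanujan identities were discovered. From (27.59) he obtained

\[
\begin{aligned}
a_0 = \chi\!\left(\frac{1}{q}\right) &= 1 + \frac{q}{1-q} + \frac{q^4}{(1-q)(1-q^2)} + \frac{q^9}{(1-q)(1-q^2)(1-q^3)} + \cdots \\
&= \frac{1}{(q)_\infty}\left(1 - (1+q)q^2 + (1+q^2)q^9 - \cdots\right) \\
&= \frac{1}{(q)_\infty}\left(1 + \sum_{m=1}^{\infty} (-1)^m\left(q^{m(5m-1)/2} + q^{m(5m+1)/2}\right)\right).
\end{aligned}
\tag{27.70}
\]

The sum could be evaluated by the triple product identity

\[
(x)_\infty \left(\frac{q}{x}\right)_\infty (q)_\infty = \sum_{n=-\infty}^{\infty} (-1)^nq^{n(n-1)/2}x^n.
\]

Replacing q by $q^5$ and setting $x = q^2$ yielded

\[
(q^2;q^5)_\infty (q^3;q^5)_\infty (q^5;q^5)_\infty = 1 + \sum_{n=1}^{\infty} (-1)^n\left(q^{n(5n-1)/2} + q^{n(5n+1)/2}\right).
\]

So the right-hand side of the equation (27.70) was reduced to

\[
\frac{(1-q^2)(1-q^7)\cdots(1-q^3)(1-q^8)\cdots(1-q^5)(1-q^{10})\cdots}{(1-q)(1-q^2)(1-q^3)\cdots}
= \prod_{n=0}^{\infty} \frac{1}{(1-q^{5n+1})(1-q^{5n+4})}.
\]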
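This completed the proof of the first identity. Similarly,

\[
\begin{aligned}
(1-q)a_1 &= q^{1/2}\left(1 + \frac{q^2}{1-q} + \frac{q^6}{(1-q)(1-q^2)} + \frac{q^{12}}{(1-q)(1-q^2)(1-q^3)} + \cdots\right) \\
&= \frac{1}{(q)_\infty}\left((1-q)q^{1/2} - (1-q^3)q^{9/2} + (q-q^6)q^{25/2} - \cdots\right).
\end{aligned}
\]

In this way, he obtained the second identity

\[
\begin{aligned}
1 + \frac{q^2}{1-q} + \frac{q^6}{(1-q)(1-q^2)} + \cdots
&= \frac{1}{(q)_\infty}\left(1 - q - q^4 + q^7 + q^{13} - \cdots\right) \\
&= \prod_{n=0}^{\infty} \frac{1}{(1-q^{5n+2})(1-q^{5n+3})}.
\end{aligned}
\tag{27.71}
\]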
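Both Rogers–Ramanujan identities can be confirmed coefficient by coefficient with exact integer arithmetic on truncated power series. The following sketch is an independent check, not Rogers's argument; the function names and the truncation order N = 40 are arbitrary choices.

```python
N = 40  # compare the coefficients of q^0, ..., q^N


def mul(a, b):
    """Product of two power series (lists of coefficients), truncated after q^N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > N:
                    break
                c[i + j] += ai * bj
    return c


def geometric(k):
    """Power series of 1/(1 - q^k)."""
    return [1 if i % k == 0 else 0 for i in range(N + 1)]


def rr_sum(shift):
    """Series side: sum over n of q^(n^2 + shift*n) / (q)_n, with shift = 0 or 1."""
    total = [0] * (N + 1)
    inv_qn = [1] + [0] * N  # power series of 1/(q)_n, starting at n = 0
    n = 0
    while n * n + shift * n <= N:
        e = n * n + shift * n
        for i in range(N + 1 - e):
            total[i + e] += inv_qn[i]
        n += 1
        inv_qn = mul(inv_qn, geometric(n))
    return total


def rr_product(residues):
    """Product side: product of 1/(1 - q^k) over k <= N with k mod 5 in residues."""
    out = [1] + [0] * N
    for k in range(1, N + 1):
        if k % 5 in residues:
            out = mul(out, geometric(k))
    return out


if __name__ == "__main__":
    print(rr_sum(0) == rr_product((1, 4)), rr_sum(1) == rr_product((2, 3)))
```

The coefficient lists on the two sides count partitions, so exact integers avoid any rounding question.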
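To prove (27.62), Rogers observed that since

\[
\frac{1}{\prod_{n=0}^{\infty} (1 - 2\lambda q^n\cos\theta + \lambda^2q^{2n})} = 1 + \frac{\lambda A_1(\theta)}{1-q} + \frac{\lambda^2A_2(\theta)}{(1-q)(1-q^2)} + \cdots,
\]

the linearization formula (27.56) for $A_r(\theta)$ implied that the left-hand side of (27.61)
was a product of series in $A_r(\theta)$ and therefore was itself such a series. He also
observed that the linearization of $A_m(\theta)A_n(\theta)$ contained a term independent of $\theta$
only when m = n, and in that case this term would be $(q)_n$. Hence, he had

\[
K_0 = C_0 + C_1\lambda + C_2\lambda^2 + \cdots. \tag{27.72}
\]

From the difference relation

\[
\delta_\lambda P(\lambda) = \frac{1}{\lambda}\bigl(P(\lambda) - P(\lambda q)\bigr) = \frac{1 - (1 - 2\lambda\cos\theta + \lambda^2)}{\lambda}\,P(\lambda) = (2\cos\theta - \lambda)\,P(\lambda),
\]

Rogers obtained

\[
\delta_\lambda\bigl(K_0 + K_1A_1(\theta) + K_2A_2(\theta) + \cdots\bigr) = (C_0 + C_1A_1 + \cdots)\,\delta_\lambda P(\lambda)
= (2\cos\theta - \lambda)\bigl(K_0 + K_1A_1(\theta) + \cdots\bigr).
\]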

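He equated the coefficients of $A_r$ on both sides and applied the recurrence relation
(27.43)

\[
A_{r+1} + (1-q^r)A_{r-1} = 2\cos\theta\,A_r
\]

to conclude

\[
(1-q^{r+1})K_{r+1} = \lambda K_r - K_{r-1} + \delta_\lambda K_r. \tag{27.73}
\]

He showed inductively that (27.73) implied

\[
K_r = H_r(\lambda,\delta_\lambda)\,K_0, \tag{27.74}
\]

where $H_r$ was defined as in (27.53). For r = 0, (27.73) gave him

\[
(1-q)K_1 = (\lambda + \delta_\lambda)K_0 = (1-q)\,H_1(\lambda,\delta_\lambda)K_0.
\]

He assumed the result true up to r so that (27.73) could be written as

\[
(1-q^{r+1})K_{r+1} = (\lambda + \delta_\lambda)H_r(\lambda,\delta_\lambda)K_0 - H_{r-1}(\lambda,\delta_\lambda)K_0. \tag{27.75}
\]

He noted that when m + n = r, the coefficient of $\lambda^m\delta_\lambda^n$ in $H_r(\lambda,\delta_\lambda)$ was

\[
\frac{1}{(q)_m(q)_{r-m}}.
\]

He defined the operator $\eta$ by $\eta f(\lambda) = f(q\lambda)$, so that

\[
\begin{aligned}
\delta_\lambda\lambda^m\delta_\lambda^nf(\lambda) &= \lambda^{m-1}\delta_\lambda^nf(\lambda) - \lambda^{m-1}q^m\,\eta\,\delta_\lambda^nf(\lambda) \\
&= \lambda^{m-1}\delta_\lambda^nf(\lambda) - q^m\bigl(\lambda^{m-1}\delta_\lambda^nf(\lambda) - \lambda^m\delta_\lambda^{n+1}f(\lambda)\bigr) \\
&= (1-q^m)\lambda^{m-1}\delta_\lambda^nf(\lambda) + q^m\lambda^m\delta_\lambda^{n+1}f(\lambda).
\end{aligned}
\tag{27.76}
\]

The first term in the expression in the third line of (27.76) cancelled with the term
containing $\lambda^{m-1}\delta_\lambda^n$ in $H_{r-1}(\lambda,\delta_\lambda)$.
So (27.74) implied that

\[
\begin{aligned}
(1-q^{r+1})K_{r+1} &= \sum_{m=0}^{r+1} \left(\frac{q^m}{(q)_m(q)_{r-m}} + \frac{1}{(q)_{m-1}(q)_{r-m+1}}\right)\lambda^m\delta_\lambda^{r+1-m}K_0 \\
&= (1-q^{r+1})\sum_{m=0}^{r+1} \frac{\lambda^m\delta_\lambda^{r+1-m}}{(q)_m(q)_{r+1-m}}\,K_0 \\
&= (1-q^{r+1})\,H_{r+1}(\lambda,\delta_\lambda)K_0,
\end{aligned}
\]

where $1/(q)_{-1}$ is read as zero. Thus (27.74) was proved. Moreover, by (27.72), Rogers had

\[
(C_0 + C_1A_1 + C_2A_2 + \cdots)\,P(\lambda) = \bigl(1 + H_1(\lambda,\delta_\lambda)A_1 + H_2(\lambda,\delta_\lambda)A_2 + \cdots\bigr)(C_0 + C_1\lambda + C_2\lambda^2 + \cdots).
\]

This meant, remarked Rogers, that $K_0 + K_1x + K_2x^2 + \cdots$ was equal to

\[
\bigl(1 + xH_1(\lambda,\delta_\lambda) + x^2H_2(\lambda,\delta_\lambda) + \cdots\bigr)(C_0 + C_1\lambda + C_2\lambda^2 + \cdots)
= \frac{1}{(x\lambda)_\infty (x\delta_\lambda)_\infty}\,(C_0 + C_1\lambda + C_2\lambda^2 + \cdots). \tag{27.77}
\]

This completed the proof of (27.62).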


27.5 Rogers: “Third Memoir” 


Rogers defined q-ultraspherical polynomials and derived some of their properties in
his 1895 “Third memoir on the expansion of certain infinite products.”²⁰ His definition
used their generating function

\[
\frac{P(x)}{P(\lambda x)} = \sum_{n=0}^{\infty} \frac{L_n(\theta)}{(q)_n}\,x^n. \tag{27.78}
\]

20 Rogers (1895).

He obtained the recurrence relation

\[
L_r - 2\cos\theta\cdot L_{r-1}(1-\lambda q^{r-1}) + L_{r-2}(1-q^{r-1})(1-\lambda^2q^{r-2}) = 0
\]

from the equation

\[
\frac{P(x)}{P(\lambda x)}\,(1 - 2x\cos\theta + x^2) = \frac{P(qx)}{P(\lambda qx)}\,(1 - 2\lambda x\cos\theta + \lambda^2x^2). \tag{27.79}
\]


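He observed that by the q-binomial theorem,

\[
\frac{(\lambda xe^{i\theta})_\infty}{(xe^{i\theta})_\infty} = \sum_{n=0}^{\infty} \frac{(\lambda)_n}{(q)_n}\,x^ne^{in\theta};
\]

an expression for $L_r(\theta)$ could be obtained by multiplying this series with the series
for

\[
\frac{(\lambda xe^{-i\theta})_\infty}{(xe^{-i\theta})_\infty}.
\]

He also noted the following particular cases: when

\[
\lambda = 0,\quad L_r = A_r; \qquad
\lambda = q,\quad L_r = (q)_r\,\frac{\sin(r+1)\theta}{\sin\theta}; \qquad
\lambda \to 1,\quad \frac{1-\lambda q^r}{1-\lambda}\,L_r \to (q)_r\,2\cos r\theta.
\]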

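For yet another noteworthy result, Rogers supposed $M_r$ to be the same function of
$\mu$ as $L_r$ was of $\lambda$:

\[
\frac{P(x)}{P(\mu x)} = 1 + \sum_{r=1}^{\infty} \frac{M_r}{(q)_r}\,x^r.
\]

Then

\[
\frac{L_r}{(q)_r} = \sum_{0\le s\le r/2} \frac{M_{r-2s}\,(1-q^{r-2s}\mu)\,(\mu-\lambda)(\mu-\lambda q)\cdots(\mu-\lambda q^{s-1})\,(\lambda)_{r-s}}{(q)_{r-2s}\,(q)_s\,(\mu)_{r-s+1}}. \tag{27.80}
\]

Secondly, Rogers gave a formula now known as the linearization formula:
For $s \le r$,

\[
\frac{L_rL_s}{(q)_r(q)_s} = \sum_{t=0}^{s} \frac{L_{r+s-2t}\,(1-\lambda q^{r+s-2t})\,(\lambda)_{s-t}(\lambda)_t(\lambda)_{r-t}(\lambda^2)_{r+s-t}}{(\lambda^2)_{r+s-2t}\,(q)_{s-t}(q)_t(q)_{r-t}\,(\lambda)_{r+s-t+1}}. \tag{27.81}
\]

Note that the modern definition of the q-ultraspherical polynomial $C_n(\cos\theta;\lambda|q)$ is
slightly different:

\[
\frac{P(x)}{P(\lambda x)} = \sum_{n=0}^{\infty} C_n(\cos\theta;\lambda|q)\,x^n. \tag{27.82}
\]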


27.6 Rogers–Szegő Polynomials


Rogers found q-extensions for two systems of orthogonal polynomials, but he did not
prove their orthogonality; Szegő was the first to take a significant step in that direction.
He considered polynomials orthogonal on the unit circle. In a 1921 paper he showed
how to associate polynomials orthogonal on $-1 \le x \le 1$ of weight function $w(x)$ with
polynomials orthogonal on the unit circle of weight function $f(\theta) = w(\cos\theta)|\sin\theta|$.
However, Szegő had been able to find only a few simple examples; in 1926, Rogers’s
work motivated him to discover that the polynomials²¹

\[
\phi_n(z) = a\sum_{k=0}^{n} (-1)^k\,\frac{(q)_n}{(q)_k(q)_{n-k}}\,q^{-k/2}z^k,
\]

where

\[
a = \frac{q^{n/2}}{\sqrt{(q)_n}},
\]

were orthogonal on the unit circle with respect to the weight function

\[
f(\theta) = |D(e^{i\theta})|^2 \quad\text{and}\quad D(z) = \sqrt{(q)_\infty}\,(-q^{1/2}z)_\infty.
\]


Szegő proved this in his paper “Ein Beitrag zur Theorie der Thetafunktionen,” by
first observing that by the triple product identity

\[
f(\theta) = \sum_{n=-\infty}^{\infty} q^{n^2/2}e^{in\theta};
\]

hence

\[
\frac{1}{2\pi}\int_0^{2\pi} f(\theta)e^{-in\theta}\,d\theta = q^{n^2/2}, \qquad n = 0,\pm1,\pm2,\ldots. \tag{27.83}
\]

He then took $\phi_n(z) = \sum_{k=0}^{n} a_kz^k$ to determine $a_k$, by requiring the relations

\[
\frac{1}{2\pi}\int_0^{2\pi} f(\theta)\,\phi_n(z)\,\overline{z}^{\,k}\,d\theta = 0, \qquad z = e^{i\theta},\quad k = 0,1,\ldots,n-1. \tag{27.84}
\]

21 Szegő (1926) or pp. 795-805 of Szegő (1982).

By (27.83), equation (27.84) gave $\sum_{s=0}^{n} a_sq^{(s-k)^2/2} = 0$ or

\[
\sum_{s=0}^{n} a_sq^{\frac{s^2}{2}-sk} = 0, \qquad k = 0,1,\ldots,n-1. \tag{27.85}
\]


To solve this system of equations, Szegő recalled the Rothe–Gauss formula (25.13):

\[
(1+qx)(1+q^2x)\cdots(1+q^nx) = \sum_{s=0}^{n} \frac{(q)_n}{(q)_s(q)_{n-s}}\,q^{s(s+1)/2}x^s.
\]

He took $x = -q^{-k-1}$ to get

\[
\sum_{s=0}^{n} (-1)^s\,\frac{(q)_n}{(q)_s(q)_{n-s}}\,q^{\frac{s^2}{2}-\frac{s}{2}-sk} = 0, \qquad k = 0,1,\ldots,n-1. \tag{27.86}
\]

By comparing (27.85) and (27.86), he concluded that

\[
a_s = a\,(-1)^s\,\frac{(q)_n}{(q)_s(q)_{n-s}}\,q^{-s/2}, \qquad s = 0,1,\ldots,n.
\]

The factor $a$ was then chosen so that

\[
\frac{1}{2\pi}\int_0^{2\pi} f(\theta)\,|\phi_n(z)|^2\,d\theta = 1.
\]

We note that the triple product identity can be applied to also obtain the orthogonality
of the q-Hermite polynomials $A_n(\theta)$. The orthogonality relation here would be
given by

\[
\frac{1}{\pi}\int_0^{\pi} A_m(\theta)A_n(\theta)\,|(e^{2i\theta})_\infty|^2\,d\theta = \frac{2\,(q)_n\,\delta_{mn}}{(q)_\infty}. \tag{27.87}
\]


27.7 Feldheim and Lanzewizky: Orthogonality
of q-Ultraspherical Polynomials


The work of Feldheim²² and Lanzewizky²³ arose out of the papers of Fejér and Szegő
on some questions relating to generalized Legendre polynomials. To define Fejér’s
generalized Legendre polynomial, let

\[
f(z) = a_0 + a_1z + a_2z^2 + \cdots \tag{27.88}
\]

22 Feldheim (1941).
23 Lanzewizky (1941).

be analytic in a neighborhood of zero with real coefficients. Then

\[
|f(re^{i\theta})|^2 = \sum_{n=0}^{\infty} a_nr^ne^{in\theta}\,\sum_{n=0}^{\infty} a_nr^ne^{-in\theta}
= \sum_{n=0}^{\infty} r^n\sum_{k=0}^{n} a_{n-k}a_k\cos(n-2k)\theta.
\]

The last step is valid because the left-hand side is real and the coefficients are real.
The polynomials $p_n(x)$ defined by

\[
p_n(\cos\theta) = \sum_{k=0}^{n} a_ka_{n-k}\cos(n-2k)\theta, \tag{27.89}
\]


where $x = \cos\theta$, are the Fejér-Legendre polynomials; they have properties similar
to those of Legendre polynomials. For example, Fejér and Szegő proved that under
certain conditions on the coefficients, $p_n(x)$ had n zeros in the interval $(-1,1)$ and
that the zeros of $p_n(x)$ and $p_{n+1}(x)$ separated each other. Feldheim and Lanzewizky
showed that these polynomials were orthogonal when $f(z)$ in (27.88) was the
q-binomial series. Their result was stated in a different form, and they did not give the
orthogonality relation. Feldheim used the theorem: For a sequence of polynomials
$P_n(x)$ (n = 0, 1, 2, ...) to be orthogonal, it is necessary and sufficient that the
recurrence relation

\[
2b_nxP_n(x) = P_{n+1}(x) + \lambda_nP_{n-1}(x) \qquad (n \ge 1,\ \lambda_n > 0) \tag{27.90}
\]


hold true. Feldheim substituted (27.89) in (27.90) and applied the trigonometric 
identity 2x7, (x) = Ty41(%) + Ty-1(x) where T(x) = cosk@, x = cos 6. He then 
wrote (27.90) as 


$$b_n \sum_{k=0}^{n} a_k a_{n-k} \left(T_{n-2k+1}(x) + T_{n-2k-1}(x)\right) = \sum_{k=0}^{n+1} a_k a_{n-k+1} T_{n-2k+1}(x) + A_n \sum_{k=0}^{n-1} a_k a_{n-k-1} T_{n-2k-1}(x),$$

or, with $a_{-1} = 0$,

$$b_n \sum_{k=0}^{n+1} \left(a_k a_{n-k} + a_{k-1} a_{n-k+1}\right) T_{n-2k+1}(x) = \sum_{k=0}^{n+1} \left(a_k a_{n-k+1} + A_n a_{k-1} a_{n-k}\right) T_{n-2k+1}(x).$$

By equating coefficients,

$$b_n \left(a_k a_{n-k} + a_{k-1} a_{n-k+1}\right) = a_k a_{n-k+1} + A_n a_{k-1} a_{n-k}.$$

Dividing by $a_{k-1} a_{n-k}$ and setting $b_n = a_{n+1}/a_n$, Feldheim obtained

$$A_n = b_n (b_{k-1} + b_{n-k}) - b_{k-1} b_{n-k}, \quad k = 1, 2, \ldots, n. \tag{27.91}$$

He considered the three equations, obtained when $k = n$, $n-1$, and $n-2$:

$$A_n = b_n (b_{n-1} + b_0) - b_{n-1} b_0 = b_n (b_{n-2} + b_1) - b_{n-2} b_1 = b_n (b_{n-3} + b_2) - b_{n-3} b_2.$$


He initially set $b_0 = 0$, since $b_0$ could be arbitrarily chosen, and solved for $b_n$ to
obtain

$$b_n = \frac{b_1 b_{n-2}}{b_1 + b_{n-2} - b_{n-1}} = \frac{b_2 b_{n-3}}{b_2 + b_{n-3} - b_{n-1}}. \tag{27.92}$$

With $n$ replaced by $n - 1$, (27.92) gave

$$b_{n-1} = \frac{b_1 b_{n-3}}{b_1 + b_{n-3} - b_{n-2}}.$$

Solving for $b_{n-3}$ produced

$$b_{n-3} = \frac{b_{n-1}(b_{n-2} - b_1)}{b_{n-1} - b_1}$$

and therefore

$$b_{n-3} - b_{n-1} = \frac{b_{n-1}(b_{n-2} - b_{n-1})}{b_{n-1} - b_1}$$

and then

$$b_n = \frac{b_2 b_{n-1}(b_{n-2} - b_1)}{b_2 (b_{n-1} - b_1) + b_{n-1}(b_{n-2} - b_{n-1})} = \frac{b_2 b_{n-1}(b_{n-2} - b_1)}{b_{n-1}(b_2 + b_{n-2} - b_{n-1}) - b_1 b_2} = \frac{b_1 b_{n-2}}{b_1 + b_{n-2} - b_{n-1}}.$$

By simplifying,

$$(b_2 - b_1)(b_{n-2} - b_{n-1})\, b_{n-1} b_{n-2} = b_1 b_2 (b_{n-2} - b_{n-1})(b_{n-1} - b_1), \quad n = 3, 4, 5, \ldots.$$

From these relations, Feldheim expressed the value of $b_n$ in the simpler form

$$b_n = \frac{b_1^2 b_2}{b_1 b_2 - (b_2 - b_1) b_{n-1}}. \tag{27.93}$$

Considering the general case where $b_0$ was not necessarily zero, he noted that since
$b_0$ and $b_1$ could be arbitrarily chosen, all the $b_n$ in (27.93) could then be replaced by
$b_n - b_0$. Feldheim then rewrote this relation as


Cc 1 
ay aaa eg Bo = 1, By =1,n = 1,2,3,... (27.94) 
where 
by, — bo bz — bo 
Op ot 
To solve the Riccati difference equation (27.94), Feldheim expanded 5, = 1 — Pn 
as the continued fraction 
1 1 1 
a Rn ae 62). Oda SO 
ioe {=12. t=? 
so that $R_n$ and $S_n$ satisfied the recurrence relation

$$t_n = t_{n-1} - \frac{1}{c}\, t_{n-2} \tag{27.95}$$

with initial conditions $R_0 = 1$, $S_0 = 1$; $R_1 = 1 - \frac{1}{c}$, $S_1 = 1$. The linear equation
(27.95) was solved by the quadratic

$$x^2 - x + \frac{1}{c} = 0 \quad \text{to obtain} \quad x = \frac{\sqrt{c} \pm \sqrt{c - 4}}{2\sqrt{c}}.$$

For real solutions, it was required that $c > 4$ so that Feldheim could set $c = 4\cosh^2 \xi$.
Then

$$R_n = \frac{(1 + \tanh\xi)^{n+2} - (1 - \tanh\xi)^{n+2}}{2^{n+2} \tanh\xi}, \qquad S_n = \frac{(1 + \tanh\xi)^{n+1} - (1 - \tanh\xi)^{n+1}}{2^{n+1} \tanh\xi}.$$


Substituting for 6, and B,, he arrived at 


$$b_n = b_1 + (b_1 - b_0)\, \frac{\sinh (n-1)\xi}{\sinh (n+1)\xi}, \quad n = 0, 1, 2, \ldots, \tag{27.96}$$

where $\xi > 0$, and $b_0$ and $b_1$ were arbitrary. Feldheim applied this to (27.91) to
show that $A_n > 0$ if $b_1 > b_0$. Thus, he found the orthogonal generalized Legendre
polynomials. He also observed that he could obtain the ultraspherical polynomials
as special cases.
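A quick numerical sanity check of Feldheim's solution: if $b_n$ is given by (27.96), then the right-hand side of (27.91) should give the same value $A_n$ for every $k = 1, \ldots, n$. The sketch below assumes the reading $b_n = b_1 + (b_1 - b_0)\sinh((n-1)\xi)/\sinh((n+1)\xi)$ of (27.96); the function names are illustrative only.

```python
import math


def feldheim_b(n, b0, b1, xi):
    """b_n from (27.96); reduces to b0 at n = 0 and to b1 at n = 1."""
    return b1 + (b1 - b0) * math.sinh((n - 1) * xi) / math.sinh((n + 1) * xi)


def feldheim_A_values(n, b0, b1, xi):
    """Right-hand side of (27.91), evaluated for k = 1, ..., n."""
    b = [feldheim_b(j, b0, b1, xi) for j in range(n + 1)]
    return [b[n] * (b[k - 1] + b[n - k]) - b[k - 1] * b[n - k]
            for k in range(1, n + 1)]
```

For instance, `feldheim_A_values(6, 0.3, 1.1, 0.7)` returns six numbers that agree up to rounding, which is the consistency condition the derivation was designed to enforce.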

Lanzewizky’s paper was very brief and gave only statements of his results. In his
first theorem, he noted that if $c_n = a_n / a_{n-1}$, then $c_n$ satisfied the difference equation

ol ote oF atte
a ea (27.97) 
Cee OH er4 


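He presented the solution to this difference equation as

$$c_{n+1} = c_2 + (c_2 - c_1)\, \frac{U_{n-2}(\xi)}{U_n(\xi)}, \tag{27.98}$$

where

$$\xi = \frac{1}{2} \sqrt{\frac{c_3 - c_1}{c_3 - c_2}} \quad \text{and} \quad U_n(\xi) = \frac{\sin\left((n+1) \arccos \xi\right)}{\sin(\arccos \xi)}. \tag{27.99}$$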
For orthogonality, he required either that € > 1 and —& < x < 1; or that =in 


Ci 


with n > 0 and —1* < 3C5 


not possible. 
Askey has pointed out²⁴ that it is more convenient to write the solution of the
difference equation (27.97) as


< 1; he observed that for 0 < € < 1, orthogonality was 


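$$c_n = \alpha\, \frac{1 - \beta q^{n-1}}{1 - q^n}, \tag{27.100}$$

where $\alpha$, $\beta$ are real constants and $|q| < 1$. For $|q| < 1$, we get

$$\frac{a_n}{a_0} = c_1 c_2 \cdots c_n = \alpha^n\, \frac{(1 - \beta)(1 - \beta q) \cdots (1 - \beta q^{n-1})}{(1 - q)(1 - q^2) \cdots (1 - q^n)} \tag{27.101}$$

and hence

$$f(re^{i\theta}) = a_0 \sum_{n=0}^{\infty} \frac{(\beta)_n}{(q)_n}\, \alpha^n r^n e^{in\theta},$$

where $|\alpha r| < 1$ for convergence. Orthogonality is obtained if

$$\frac{(1 - q^{n+1})(1 - \beta^2 q^n)}{(1 - \beta q^n)(1 - \beta q^{n+1})} > 0.$$

So one may take $\alpha = 1$, yielding

$$p_n(\cos\theta) = \sum_{k=0}^{n} \frac{(\beta)_k (\beta)_{n-k}}{(q)_k (q)_{n-k}} \cos(n - 2k)\theta, \tag{27.102}$$

the q-ultraspherical polynomials denoted by $C_n(x; \beta | q)$ with $x = \cos\theta$.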
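The cosine sum (27.102) is easy to evaluate numerically. It can be tested against the three-term recurrence $2x(1 - \beta q^n) C_n = (1 - q^{n+1}) C_{n+1} + (1 - \beta^2 q^{n-1}) C_{n-1}$. This form of the recurrence is not stated in the passage; it is assumed here as what (27.90) becomes when $b_n = (1 - \beta q^n)/(1 - q^{n+1})$. A minimal sketch with illustrative names:

```python
import math


def q_poch(a, q, n):
    """(a; q)_n = (1 - a)(1 - a q)...(1 - a q^{n-1}), with (a; q)_0 = 1."""
    result = 1.0
    for j in range(n):
        result *= 1.0 - a * q**j
    return result


def q_ultraspherical(n, theta, beta, q):
    """C_n(cos theta; beta | q) computed from the cosine sum (27.102)."""
    total = 0.0
    for k in range(n + 1):
        coeff = (q_poch(beta, q, k) * q_poch(beta, q, n - k)
                 / (q_poch(q, q, k) * q_poch(q, q, n - k)))
        total += coeff * math.cos((n - 2 * k) * theta)
    return total


def recurrence_residual(n, theta, beta, q):
    """Left side minus right side of the assumed three-term recurrence, n >= 1."""
    x = math.cos(theta)
    lhs = 2 * x * (1 - beta * q**n) * q_ultraspherical(n, theta, beta, q)
    rhs = ((1 - q ** (n + 1)) * q_ultraspherical(n + 1, theta, beta, q)
           + (1 - beta**2 * q ** (n - 1)) * q_ultraspherical(n - 1, theta, beta, q))
    return lhs - rhs
```

The residual should vanish up to floating-point rounding for any angle and any admissible $\beta$, $q$.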

In 1977, Richard Askey and James Wilson derived explicit orthogonality relations
for some basic hypergeometric orthogonal polynomials.²⁵ These relations included
the orthogonality relation for the q-ultraspherical polynomials of Rogers. But it was

24 Andrews et al. (1999) pp. 336-337. 
25 Askey and Wilson (1985).

later, upon reading Rogers’s papers, that Askey recognized the full significance of the
q-ultraspherical polynomials. Jointly with Mourad Ismail, he worked out the properties
of these polynomials and discovered various methods for deriving their orthogonality
relation:²⁶

$$\int_{-1}^{1} C_m(x; \beta | q)\, C_n(x; \beta | q)\, w_\beta(x)\, \frac{dx}{\sqrt{1 - x^2}} = \frac{2\pi (1 - \beta)}{1 - \beta q^n}\, \frac{(\beta^2)_n}{(q)_n}\, \frac{(\beta)_\infty (\beta q)_\infty}{(\beta^2)_\infty (q)_\infty}\, \delta_{mn}, \quad 0 < q < 1, \tag{27.103}$$

where

$$w_\beta(\cos\theta) = \frac{(e^{2i\theta})_\infty (e^{-2i\theta})_\infty}{(\beta e^{2i\theta})_\infty (\beta e^{-2i\theta})_\infty}, \quad -1 < \beta < 1. \tag{27.104}$$

Interestingly, one of these methods employed Ramanujan’s summation formula,
paralleling the use of the triple product identity in the derivation of the orthogonality
relation for the q-Hermite polynomials.


27.8 Exercises 


(1) Show that 


1 | Z | z | — 1 | me | x? | 
l—-x  l—qx | 1—@q2x ' “T=2° Lge" f—92z 
See Heine (1847). 
(2) Show that 
n—1 
I] Q (a". a- =) = c Q(q,na), 
m=0 
where 
_ (d-@)a—-9*")-9™")---)" 
(1 — q)(1 — g*)(1 — g3)- 
See Heine (1847). 
(3) Let 


e@)=[[a-a"tytda-¢qty 1, 
n=0 


¥@=[[a-¢")'a-9¢)"7 
n=0 


26 Askey and Ismail (1980); Askey and Ismail (1983).

and then prove that


n2 


_ ~ 1 2ny) . . ee ee 
(a) 6(q) at +q") 2 7a as 


n=0 
(e“e) (eve) n2 
b aid 1—g2-y. q 
(b) $(q*) II aes | Si 
i 1 2n ~ Gee 
cova =T]¢ +4q BP ere 


See Rogers (1894) pp. 330-331. 
(4) Following Rogers, define $B_n(\theta)$ by

$$\prod_{n=1}^{\infty} \left(1 + 2xq^n \cos\theta + x^2 q^{2n}\right) = \sum_{n=0}^{\infty} \frac{B_n(\theta)}{(q)_n}\, x^n.$$


Demonstrate that 


(a) 
n 1—gq" 
Bon (0) = ght) nae (1 | = aa . 2g cos 20 
_ d=gd=as) i 
© (l= qt = qr*?) -2q" cos46 + -- ). 
(b) 


(q)2n+1 
(Q)n(Q)n+1 
(1—q")( —q""!) 


| ‘es 6 eee 
| (= gD — ght3) 2q° cos 50 + ) 


1 _ 7 
Bonsi(6) = gat (2 cos 6 + it . 2q? cos 30 


(c) 


See Rogers (1917) pp. 315-316. 
(5) Replace 2 cos 2k@ by 


k(k—1) 


(-1'd+¢@5q°2 


in the expression for B2,(@) in Exercise 4(a) and denote the result fo). 
Likewise, let 62,4 1 denote the result of replacing 2 cos(2k + 1)@ by 


k(k—1) 


(-1)F 1 — gq"
in 4(b). Show that


(a) Bong = g™ td — q7"t!) Bon, 
ie _ qt 
(b) Bon+2 = q" ‘Ta gra Bart: 


In Exercise 4(c), equate terms containing even multiples of 6, and replace these 
cosines as indicated earlier in this exercise. Show that this process leads to 


1 
1 2 ki 
Gan ‘ 


Boq7' , Baqe? , 

(go (qa 

| ee a 
1-q' @2 (@)3_ 


Prove that the cosines of odd multiples of 6 lead to the second Rogers— 
Ramanujan identity. See Rogers (1917) pp. 316-317. In the 1940s, W. N. Bailey 
elucidated the underlying structure of Rogers’s method. In the 1980s, 
G. E. Andrews developed Bailey’s idea into a powerful tool to handle q- 
series and mock theta functions. Andrews named this method Bailey chains 
and around the same time, P. Paule independently realized the significance of 
Bailey’s method. See Andrews (1986b). 


(6) Show that 


Lg 


x74 


(1 — 4°) 


(i) - 
I] _ gn fos 
a 1—q"x l-x 


4 . 


tq: 


(1 — x). — qx) 


(x — q)(x — 4”) 


tq 


d—x)d 


qx)(1 — q?x) 


7 1 
Consider the cases x = 0, x = q2 and x = —1. Prove that 


See Rogers (1893a) p. 30. 


c@tDgn@+) 


ye 


al 


1— cq" 


27.9 Notes on the Literature 


See Andrews (1986b) for an interesting and detailed discussion, with good references, 
of the work of Heine, Thomae, Rogers, and Ramanujan. 


28 


Dirichlet L-Series 


28.1 Preliminary Remarks 


A Dirichlet L-series can be seen, in general, as a series taking the form

$$\sum_{m=1}^{\infty} \frac{a_m}{m^s},$$

where $|a_m| = 1$ or $0$ and $a_{m+n} = a_m$ for some positive integer $n$; also, it must be
expressible as the product over primes (known as an Euler product):

$$\prod_p \left(1 - a_p p^{-s}\right)^{-1}.$$


Dirichlet studied L-series in the context of proving the existence of an infinite 
number of primes of the form a + nd, with a and d relatively prime integers. Euler 
had earlier verified this result for primes of the form 4n + 1 and 4n + 3. 

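In 1739, as we discuss in Chapter 16, Euler summed the infinite series

$$1 + \frac{1}{2^{2k}} + \frac{1}{3^{2k}} + \frac{1}{4^{2k}} + \cdots = \frac{(-1)^{k-1}\, 2^{2k-1} \pi^{2k} B_{2k}}{(2k)!}; \tag{28.1}$$

as mentioned earlier, this series is written in modern notation as $\zeta(2k)$, with $\zeta(s)$
defined by

$$\zeta(s) = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \frac{1}{4^s} + \cdots \tag{28.2}$$

for $s > 1$ or, with $s$ a complex number, $\operatorname{Re} s > 1$.
In 1737, Euler had shown¹ that (28.2) could be expressed as the infinite product

$$\zeta(s) = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1} = \left(1 - \frac{1}{2^s}\right)^{-1} \left(1 - \frac{1}{3^s}\right)^{-1} \left(1 - \frac{1}{5^s}\right)^{-1} \cdots, \tag{28.3}$$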


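1 Eu. 1-14 pp. 217-244. E 72, Theorem 8.

where $s > 1$. Next, apply the geometric series expansion

$$\left(1 - \frac{1}{p^s}\right)^{-1} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots$$

to each term in the product (28.3) to arrive at

$$\left(1 + \frac{1}{2^s} + \frac{1}{2^{2s}} + \cdots\right) \left(1 + \frac{1}{3^s} + \frac{1}{3^{2s}} + \cdots\right) \left(1 + \frac{1}{5^s} + \frac{1}{5^{2s}} + \cdots\right) \cdots = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \frac{1}{2^{2s}} + \frac{1}{5^s} + \frac{1}{2^s \cdot 3^s} + \cdots$$

and, by the unique factorization of integers,

$$= \zeta(s) \quad \text{for } s > 1.$$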


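Euler later presented several other similar series converted into products over
primes,² now called Euler products. For example, Euler gave

$$1 - \frac{1}{2^s} + \frac{1}{4^s} - \frac{1}{5^s} + \frac{1}{7^s} - \frac{1}{8^s} + \cdots = \left(1 + \frac{1}{2^s}\right)^{-1} \left(1 + \frac{1}{5^s}\right)^{-1} \left(1 - \frac{1}{7^s}\right)^{-1} \left(1 + \frac{1}{11^s}\right)^{-1} \cdots \tag{28.4}$$

and

$$1 + \frac{1}{3^s} - \frac{1}{5^s} - \frac{1}{7^s} + \frac{1}{9^s} + \frac{1}{11^s} - \frac{1}{13^s} - \frac{1}{15^s} + \cdots = \left(1 - \frac{1}{3^s}\right)^{-1} \left(1 + \frac{1}{5^s}\right)^{-1} \left(1 + \frac{1}{7^s}\right)^{-1} \left(1 - \frac{1}{11^s}\right)^{-1} \cdots. \tag{28.5}$$

We remark that Euler wrote the formula (28.5) for $s = 1$, since he knew that for
$s = 1$ its value was $\frac{\pi}{2\sqrt{2}}$. Recall that Newton had given this series and its value in his
second letter to Leibniz.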

Observe that in the series in (28.4), all multiples of 3 are missing and the
coefficients of $4^{-s}, 7^{-s}, 10^{-s}, \ldots$ are all $+1$, while the coefficients of $2^{-s}, 5^{-s},
8^{-s}, \ldots$ are all $-1$. Thus, every term of the form $(3m + 1)^{-s}$ has $+1$ as coefficient,
whereas every term of the form $(3m - 1)^{-s}$ has coefficient $-1$. Note also that the
product of two numbers of the same form is of the form $3m + 1$, while the product of
two numbers of different forms is of the form $3m - 1$. Numbers of the form $3m - 1$
can also take the form $3(m - 1) + 2$. If we let [1] and [2] denote the numbers of the
form $3m + 1$ and $3m + 2$ respectively and if we associate with [1] and [2] the integers
1 and −1 respectively, then we can write

$$\chi([1]) = 1 \quad \text{and} \quad \chi([2]) = -1$$


2 Euler (1988) chapter 15.

and

$$\chi(mn) = \chi(m)\, \chi(n). \tag{28.6}$$

Thus, denoting the integers modulo $n$ by $\mathbb{Z}_n$ and the integers prime to $n$ by $\mathbb{Z}_n^\times$, we
see that (28.6) defines a mapping

$$\chi : \mathbb{Z}_3^\times \to \{1, -1\}, \tag{28.7}$$

where

$$\chi([ab]) = \chi([a])\, \chi([b]).$$

The function $\chi$ can then be extended to $\mathbb{Z}_n$ by setting

$$\chi([a]) = 0 \quad \text{if } (a, n) = d > 1,$$

where $(a, n)$ is the gcd of $a$ and $n$.
With this notation, we can rewrite (28.4) as

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_p \left(1 - \frac{\chi(p)}{p^s}\right)^{-1}, \tag{28.8}$$

where the product is over all primes and

$$\chi : \mathbb{Z}_3 \to \{0, 1, -1\}.$$
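The identity (28.8) can be explored numerically by comparing a long partial sum of the series with a truncated product over primes. The sketch below uses the character modulo 3 described in the text; the helper names and the truncation limits are arbitrary choices made only for illustration.

```python
def chi3(n):
    """Character modulo 3: +1 on 3m+1, -1 on 3m+2, 0 on multiples of 3."""
    return (0, 1, -1)[n % 3]


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def l_series(s, terms):
    """Partial sum of the series on the left of (28.8)."""
    return sum(chi3(n) / n**s for n in range(1, terms + 1))


def euler_product(s, prime_limit):
    """Partial product on the right of (28.8), over primes <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1.0 / (1.0 - chi3(p) / p**s)
    return product
```

At $s = 2$ both truncations converge quickly, so `l_series(2, 100000)` and `euler_product(2, 10000)` should agree to several decimal places; the factor for $p = 3$ is 1 because $\chi(3) = 0$.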
Clearly, for (28.5) the function $\chi$ is defined on $\mathbb{Z}_8$. Thus

$$\chi(2) = \chi(4) = \chi(6) = \cdots = 0,$$

while

$$\chi(8m + 1) = \chi(8m + 3) = 1$$

and

$$\chi(8m - 1) = \chi(8m - 3) = -1.$$

We now see that (28.5) may be written in exactly the same way as (28.8). There are
other series for which $\chi$ can be defined modulo 8. For example, Euler also gave


(28.9)

Two other series of this kind, expressible as products, are:


1 1 1 1 1 1 
1 35 55 58 gs 11° 135 ae 
1 1 1 1 1 1 


1+ 


35 55 75 gs 115 135 


It can be shown that there are exactly four series that can be expressed as products
in this manner; this is because $\mathbb{Z}_8^\times$ has four elements.

For his work on primes in arithmetic progressions, Dirichlet had to consider
complex-valued multiplicative functions $\chi$, while Euler discussed only those series
in which the value of $\chi$ was $\pm 1$. Since $|\pm 1| = 1$, Dirichlet considered those complex
$\chi(m)$ for which $|\chi(m)| = 1$ and $m \in \mathbb{Z}_n^\times$. As a simple example, consider the case in
which $n = 5$, that is, where $\chi$ is defined on $\mathbb{Z}_5^\times$. Observe that 3 generates $\mathbb{Z}_5^\times$ since,
modulo 5, $3^1 = 3$, $3^2 = 4$, $3^3 = 2$, and $3^4 = 1$. Since $\chi$ is multiplicative, it is sufficient
to define $\chi(3)$. Clearly, $\chi(3)$ is a root of the equation $x^4 = 1$; the four possible values
of $\chi(3)$ are thus $\pm i$, $\pm 1$. The four series corresponding to these values are then


l-s+o-s ston gte. (28.10) 


1 . eater eey (28.11) 


2s 3 Ss 4s 65 75 85 Qs 


Observe that (28.10) can be written as a product 


»~\—l »\ —l -\—l 
ee ly eee as cree 
28 35 7s if 


where the prime 5 is missing from the product, just as it is from the series.

Dirichlet denoted series such as (28.8) through (28.11), when written as products,
by the letter $L$, so we now call them L-series.³ One notation for these series is $L(\chi, s)$,
where $\chi$ is a multiplicative function

$$\chi : \mathbb{Z}_n^\times \to \mathbb{T},$$

where $\mathbb{T}$ denotes the complex numbers $z$, with $|z| = 1$.
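To make the modulo 5 example concrete, a character can be tabulated from a generator by setting $\chi(g^k) = w^k$. The sketch below is illustrative only: the function name is hypothetical, and the choice $\chi(3) = i$ is just one of the four possibilities mentioned in the text.

```python
def character_mod_p(p, g, w):
    """Return chi with chi(g^k) = w^k on residues prime to p and chi = 0 on multiples of p.

    Assumes p is prime, g generates the nonzero residues modulo p,
    and w is a (p - 1)th root of unity.
    """
    table = {0: 0j}
    power = 1
    for k in range(p - 1):
        table[power] = complex(w) ** k
        power = (power * g) % p
    return lambda n: table[n % p]
```

With `chi = character_mod_p(5, 3, 1j)`, the values on 1, 2, 3, 4 are $1, -i, i, -1$, and `chi(m * n)` equals `chi(m) * chi(n)` for all integers `m`, `n`, which is the multiplicative property needed for the Euler product.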


28.2 Dirichlet’s Summation of L(1, χ)


The series known as the Madhava–Leibniz formula, $1 - \frac{1}{3} + \frac{1}{5} - \cdots = \frac{\pi}{4}$, discussed in
Section 16.1, gives the value of $L(1, \chi)$ for the nontrivial character modulo 4, defined


3 Dirichlet (1969) vol. 1, pp. 317-318.

by $\chi(4n \pm 1) = \pm 1$. Euler employed this series to prove that primes of the form $4n + 1$
and of the form $4n - 1$ were both infinite in number. In the 1830s, J. P. G. Lejeune
Dirichlet went further, giving a general evaluation of $L(1, \chi)$ to prove his results on
quadratic forms and on primes in arithmetic progressions. Dirichlet first examined
the case in which $\chi$ was a character modulo $p$, where $p$ was a prime. We limit our
discussion to this simple case. Dirichlet defined this character by taking any generator
$g$ of the cyclic group consisting of the integers modulo $p$ without the zero element.
Next, he let $w$ be any $(p - 1)$th root of unity. For $n$ not divisible by $p$, he set

$$\chi(n) = w^{\gamma_n} \quad \text{where } g^{\gamma_n} \equiv n \pmod{p}.$$

By convention, $\chi(mp) = 0$. More details on Dirichlet’s theory of characters are given
later in this chapter. To evaluate the L-series $\sum_{n=1}^{\infty} \frac{w^{\gamma_n}}{n^s}$ at $s = 1$ when $w \neq 1$,
Dirichlet first expressed the series as an integral. Note that the terms in which $n$ was
a multiple of $p$ were taken to be 0. In a paper of 1768 on series related to the zeta
function,⁴ Euler had used the idea of expressing this type of series as an integral. Like
Euler, Dirichlet started with⁵

$$\int_0^1 x^{n-1} \left(\log \frac{1}{x}\right)^{s-1} dx = \frac{\Gamma(s)}{n^s}. \tag{28.12}$$
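From the periodicity of the character $w^{\gamma_n}$, he had

$$\begin{aligned}
L_w(s) = \sum_{n=1}^{\infty} \frac{w^{\gamma_n}}{n^s} &= \sum_{k=1}^{p-1} w^{\gamma_k} \left(\frac{1}{k^s} + \frac{1}{(k+p)^s} + \frac{1}{(k+2p)^s} + \cdots\right) \\
&= \frac{1}{\Gamma(s)} \sum_{k=1}^{p-1} w^{\gamma_k} \int_0^1 \left(x^{k-1} + x^{k+p-1} + x^{k+2p-1} + \cdots\right) \left(\log \frac{1}{x}\right)^{s-1} dx \\
&= \frac{1}{\Gamma(s)} \sum_{k=1}^{p-1} w^{\gamma_k} \int_0^1 \frac{x^{k-1}}{1 - x^p} \left(\log \frac{1}{x}\right)^{s-1} dx \\
&= \frac{1}{\Gamma(s)} \int_0^1 \frac{f(x)}{x(1 - x^p)} \left(\log \frac{1}{x}\right)^{s-1} dx,
\end{aligned} \tag{28.13}$$

where

$$f(x) = \sum_{k=1}^{p-1} w^{\gamma_k} x^k. \tag{28.14}$$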

Unlike most eighteenth-century mathematicians, Dirichlet dealt carefully with 
convergence and term-by-term integration, so he summed only the first (p — 1)h terms 
of the series and then showed that this sum differed from the integral (28.13) by an 


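4 Eu. 1-15 pp. 91-130. E 393.
5 Dirichlet (1969) vol. 1, pp. 313-342.

integral with the limit zero as $h \to \infty$. When $s = 1$, the logarithmic term in the
integral (28.13) vanished. Thus, Dirichlet observed that $L_w(1)$, an integral of a rational
function, could be computed in terms of logarithms and circular functions.

Dirichlet pointed out that the factors of $x^p - 1$ were of the form $x - e^{2m\pi i/p}$; hence,

$$\frac{f(x)}{x(x^p - 1)} = \sum_{m=1}^{p-1} \frac{A_m}{x - e^{2m\pi i/p}},$$

where

$$A_m = \lim_{x \to e^{2m\pi i/p}} \left(x - e^{2m\pi i/p}\right) \frac{f(x)}{x(x^p - 1)}.$$

This limit was the value of $\frac{f(x)}{x \cdot p x^{p-1}}$ at $x = e^{2m\pi i/p}$; thus, he found

$$A_m = \frac{1}{p} f\left(e^{2m\pi i/p}\right) = \frac{1}{p} \sum_{k=1}^{p-1} w^{\gamma_k} e^{2km\pi i/p}. \tag{28.15}$$

Next, Dirichlet set $km \equiv h \pmod{p}$ so that $w^{\gamma_k} = w^{-\gamma_m} w^{\gamma_h}$ and

$$f\left(e^{2m\pi i/p}\right) = w^{-\gamma_m} \sum_{h=1}^{p-1} w^{\gamma_h} e^{2h\pi i/p} = w^{-\gamma_m} f\left(e^{2\pi i/p}\right).$$

In this manner, Dirichlet arrived at

$$L_w(1) = \sum_{n=1}^{\infty} \frac{w^{\gamma_n}}{n} = -\frac{1}{p} f\left(e^{2\pi i/p}\right) \sum_{m=1}^{p-1} w^{-\gamma_m} \int_0^1 \frac{dx}{x - e^{2m\pi i/p}}. \tag{28.16}$$

He next noted that the last integral could be expressed as

$$\log\left(1 - e^{-2m\pi i/p}\right) = \log\left(2 \sin \frac{m\pi}{p}\right) + \frac{\pi i}{2} \left(1 - \frac{2m}{p}\right)$$

so that

$$\sum_{n=1}^{\infty} \frac{w^{\gamma_n}}{n} = -\frac{1}{p} f\left(e^{2\pi i/p}\right) \sum_{m=1}^{p-1} w^{-\gamma_m} \left(\log\left(2 \sin \frac{m\pi}{p}\right) + \frac{\pi i}{2} \left(1 - \frac{2m}{p}\right)\right). \tag{28.17}$$

Dirichlet further observed that this formula took a much simpler form when
$w = -1$. This corresponded to the quadratic character $(-1)^{\gamma_n} = \left(\frac{n}{p}\right)$; he then had

$$\sum_{n=1}^{\infty} \left(\frac{n}{p}\right) \frac{1}{n} = -\frac{1}{p} f\left(e^{2\pi i/p}\right) \sum_{m=1}^{p-1} \left(\frac{m}{p}\right) \left(\log\left(2 \sin \frac{m\pi}{p}\right) + \frac{\pi i}{2} \left(1 - \frac{2m}{p}\right)\right).$$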
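Since $\sum_{m=1}^{p-1} \left(\frac{m}{p}\right) = \sum_{m=1}^{p-1} (-1)^{\gamma_m} = 0$, he could simplify to obtain

$$\sum_{n=1}^{\infty} \left(\frac{n}{p}\right) \frac{1}{n} = -\frac{1}{p} f\left(e^{2\pi i/p}\right) \sum_{m=1}^{p-1} \left(\frac{m}{p}\right) \left(\log\left(2 \sin \frac{m\pi}{p}\right) - \frac{\pi i}{p}\, m\right). \tag{28.18}$$

He then noted that

$$\left(\frac{p - m}{p}\right) = \pm \left(\frac{m}{p}\right),$$

using plus if $p$ took the form $4n + 1$ and minus if $p$ took the form $4n + 3$. Note
that when $p$ is of the form $4n + 1$, the imaginary part of the sum vanishes because
$\sum m \left(\frac{m}{p}\right) = 0$ when $\left(\frac{p-m}{p}\right) = \left(\frac{m}{p}\right)$. Dirichlet could then conclude that

$$\sum_{n=1}^{\infty} \left(\frac{n}{p}\right) \frac{1}{n} = \frac{1}{p} f\left(e^{2\pi i/p}\right) \log \frac{\prod \sin \frac{b\pi}{p}}{\prod \sin \frac{a\pi}{p}}, \tag{28.19}$$

where $a$ represented quadratic residues (mod $p$) and $b$ nonresidues. Observe that for
the case $p = 4n + 3$,

$$\left(\frac{m}{p}\right) \log\left(2 \sin \frac{m\pi}{p}\right) = -\left(\frac{p - m}{p}\right) \log\left(2 \sin \frac{(p - m)\pi}{p}\right),$$

and hence the sum of these terms is zero and

$$\sum_{n=1}^{\infty} \left(\frac{n}{p}\right) \frac{1}{n} = \frac{\pi i}{p^2} f\left(e^{2\pi i/p}\right) \left(\sum a - \sum b\right). \tag{28.20}$$

Moreover, the term $f\left(e^{2\pi i/p}\right)$ is the quadratic Gauss sum

$$\sum_{k=1}^{p-1} (-1)^{\gamma_k} e^{2k\pi i/p} = \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) e^{2k\pi i/p} = \begin{cases} \sqrt{p}, & p = 4n + 1, \\ i\sqrt{p}, & p = 4n + 3; \end{cases}$$

in this connection, see Section 19.7 or Section 25.6. We thus obtain Dirichlet’s final
formulas⁶

$$\sum_{n=1}^{\infty} \left(\frac{n}{p}\right) \frac{1}{n} = \frac{1}{\sqrt{p}} \log \frac{\prod \sin \frac{b\pi}{p}}{\prod \sin \frac{a\pi}{p}}, \quad p \equiv 1 \pmod{4}, \tag{28.21}$$

$$\sum_{n=1}^{\infty} \left(\frac{n}{p}\right) \frac{1}{n} = \frac{\pi}{p\sqrt{p}} \left(\sum b - \sum a\right), \quad p \equiv 3 \pmod{4}. \tag{28.22}$$

Dirichlet wrote that the last formula implied that for primes of the form $4n + 3$,
$\sum b > \sum a$, that is, the sum of the quadratic nonresidues was greater than the sum of
the quadratic residues, and that it would be difficult to prove this in a different way.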
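Dirichlet's closed forms (28.21) and (28.22) can be compared with a direct partial sum of the series. The following sketch assumes the Legendre symbol is computed by Euler's criterion and truncates the conditionally convergent series after whole periods of the character; the function names and the default number of periods are arbitrary.

```python
import math


def legendre(n, p):
    """Legendre symbol (n/p) via Euler's criterion, for an odd prime p."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1, or p - 1


def l_one_series(p, periods=20000):
    """Partial sum of sum_{n>=1} (n/p)/n over the first periods * p terms."""
    table = [legendre(r, p) for r in range(p)]
    return sum(table[n % p] / n for n in range(1, periods * p + 1))


def l_one_closed_form(p):
    """Right-hand side of (28.21) if p = 1 (mod 4), of (28.22) if p = 3 (mod 4)."""
    residues = [m for m in range(1, p) if legendre(m, p) == 1]
    nonresidues = [m for m in range(1, p) if legendre(m, p) == -1]
    if p % 4 == 1:
        ratio = (math.prod(math.sin(b * math.pi / p) for b in nonresidues)
                 / math.prod(math.sin(a * math.pi / p) for a in residues))
        return math.log(ratio) / math.sqrt(p)
    return math.pi * (sum(nonresidues) - sum(residues)) / (p * math.sqrt(p))
```

For $p = 7$ the residues are 1, 2, 4 and the nonresidues 3, 5, 6, so (28.22) gives $\pi(14 - 7)/(7\sqrt{7}) = \pi/\sqrt{7}$, and the partial sums approach this value from the series side.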


6 ibid. p. 327.

28.3 Eisenstein’s Proof of the Functional Equation


The discovery of Eisenstein’s proof of the functional equation began in 1964
when B. Artmann came across Eisenstein’s old copy of Gauss’s Disquisitiones in
the Giessen University Mathematical Institute Library. This book had belonged to
Ferdinand Eisenstein (1823-1852) and then to Eugen Netto (1848-1919), student of
Weierstrass and Kummer, before arriving at the Library. The proof, in Eisenstein’s
hand and dated 1849, appeared on the last blank page of the book; with the help of
Artmann and the librarian, André Weil was able to examine it and to publish it in a
paper of 1989.⁷ Eisenstein’s proof started with the formula


[ evi ysl dw = ae ae: (28.23) 


This is in fact the Fourier transform of the function

$$f(\psi) = \begin{cases} \psi^{q-1} & \text{for } \psi > 0,\ 0 < q < 1, \\ 0 & \text{for } \psi < 0. \end{cases}$$

For this formula, Eisenstein referred to an 1836 paper by Dirichlet on definite integrals.
In that paper, Dirichlet noted that the formula was first found by Euler but that Poisson
gave the proof, with the convergence condition $0 < q < 1$. Eisenstein then applied
the Poisson summation formula

$$\sum_{n=-\infty}^{\infty} \varphi(n) = \sum_{m=-\infty}^{\infty} \hat{\varphi}(m), \tag{28.24}$$

where $\hat{\varphi}$ was the Fourier transform of $\varphi$, to the function


re gore P Gg = Byer) forx > B,0O<a<1,0<6 <1, 
XX) 
0 for x < B. 
He then had 
e2ta(l—B)i e2ta(2—B)i e274(3—B)i 


a— py Ca "G-ph 
3 ie ect Aig — B)I- 1 ertiok ay 


o=—0O0 
(28.25) 
= 3 [ ertlatoyai eZTiB yq- lap 
o=—00 


; | 


_T@ a A emer | TG) ax 
ae Bye: (2x) 


Q 
iMe 
elk 
| 
R 
Sw 


o=0 


7 Weil (1989a).

where the last step followed from (28.23). By taking $\alpha = \beta = \frac{1}{2}$, he obtained the
functional equation for the L-function

$$1 - \frac{1}{3^{1-q}} + \frac{1}{5^{1-q}} - \frac{1}{7^{1-q}} + \cdots = \frac{2^q\, \Gamma(q)}{\pi^q} \sin \frac{q\pi}{2} \left(1 - \frac{1}{3^q} + \frac{1}{5^q} - \frac{1}{7^q} + \cdots\right). \tag{28.26}$$

Eisenstein also observed at this point that when $q$ was replaced by $1 - q$ and the two
formulas were multiplied, he got another proof of Euler’s reflection formula, discussed
in Section 17.1:

$$\Gamma(q)\, \Gamma(1 - q) = \frac{\pi}{\sin \pi q}.$$

28.4 Riemann’s Derivations of the Functional Equation 


It is thought that Eisenstein may have discussed his proof of the functional equation
with Riemann, perhaps inspiring Riemann’s 1859 paper on the number of primes less
than a given number.⁸ In this paper, Riemann used complex analysis to give two
new proofs of the functional equation. One proof made use of contour integration
and the second, deeper proof employed the transformation of a theta function. The
latter method presaged a connection between modular forms and the corresponding
Dirichlet series obtained by applying the Mellin transform.

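The proof by contour integration started with two formulas due to Euler, though
Riemann did not attribute them to anyone, perhaps regarding them as well known: For
$\operatorname{Re} s > 0$,

$$\int_0^{\infty} e^{-nx} x^{s-1}\, dx = \frac{\Gamma(s)}{n^s}, \tag{28.27}$$

$$\Gamma(s)\, \zeta(s) = \int_0^{\infty} \frac{x^{s-1}}{e^x - 1}\, dx. \tag{28.28}$$

We here mention that Riemann used Gauss’s notation for the gamma function:
$\Pi(s - 1)$. Observe that the second formula follows from the first, using the geometric
series expansion

$$\frac{1}{e^x - 1} = e^{-x} + e^{-2x} + e^{-3x} + \cdots.$$

Riemann then considered the integral

$$\int \frac{(-x)^{s-1}}{e^x - 1}\, dx \tag{28.29}$$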


8 Riemann (2004) pp. 135-143.

over a contour from $+\infty$ to $+\infty$ in the positive sense around the boundary of a region
containing in its interior 0 but no other singularities of the integrand. He noted that
this integral simplified to

$$\left(e^{-\pi s i} - e^{\pi s i}\right) \int_0^{\infty} \frac{x^{s-1}}{e^x - 1}\, dx, \tag{28.30}$$

provided that one used the branch of the many-valued function $(-x)^{s-1} = e^{(s-1)\log(-x)}$
for which $\log(-x)$ was real for negative values of $x$. From (28.28), (28.29), and
(28.30), he concluded that

$$2 \sin(s\pi)\, \Gamma(s)\, \zeta(s) = i \int_{\infty}^{\infty} \frac{(-x)^{s-1}}{e^x - 1}\, dx. \tag{28.31}$$

Riemann pointed out that this integral defined $\zeta(s)$ as an analytic function of $s$ with
a singularity at $s = 1$. In addition, we note that since


Be hase 1 ge gee 
Foie ees esas.) a” (ee 


two of Euler’s famous formulas are immediate corollaries, though Riemann noted only 
the first one: 


(—1)" Bon 


¢(—2n)=0, and ¢(1—2n)= 
2n 


= eee (28.32) 


To obtain the functional equation, Riemann remarked at this point that for $\operatorname{Re} s < 0$,
the contour for the integral in (28.31) could be viewed as if defined (with a negative
orientation) as the boundary of the complementary region containing the singularities
$\pm 2n\pi i$, $n > 0$ of the integrand. Since the residue at $2n\pi i$ was $(-n 2\pi i)^{s-1}(-2\pi i)$,
he obtained the equation

$$2 \sin(s\pi)\, \Gamma(s)\, \zeta(s) = (2\pi)^s \sum n^{s-1} \left((-i)^{s-1} + i^{s-1}\right). \tag{28.33}$$

Riemann noted that by using the known properties of the gamma function, (28.33)
could be seen as equivalent to the statement that

$$\Gamma\left(\frac{s}{2}\right) \pi^{-\frac{s}{2}}\, \zeta(s) \tag{28.34}$$

was invariant under the transformation $s \to 1 - s$. And this was the functional equation
for $\zeta(s)$. In his 1859 paper, Riemann noted only the first equation in (28.32), though
he clearly knew the second one as well; when combined with the functional equation,
this yields a new proof of Euler’s formula

$$\zeta(2n) = \frac{(-1)^{n-1}\, 2^{2n-1} \pi^{2n} B_{2n}}{(2n)!}.$$

Riemann wrote that the expression in (28.34) and its invariance led him to consider
the integral for $\Gamma\left(\frac{s}{2}\right)$ and thus directed him to another important derivation of the
functional equation. Since

$$\frac{1}{n^s}\, \Gamma\left(\frac{s}{2}\right) \pi^{-\frac{s}{2}} = \int_0^{\infty} e^{-n^2 \pi x} x^{\frac{s}{2} - 1}\, dx,$$

Riemann used term-by-term integration to find that

$$\Gamma\left(\frac{s}{2}\right) \pi^{-\frac{s}{2}}\, \zeta(s) = \int_0^{\infty} \psi(x)\, x^{\frac{s}{2} - 1}\, dx. \tag{28.35}$$

It was proved by Cauchy and Poisson, and a little later by Jacobi, that

$$\sum_{n=-\infty}^{\infty} e^{-n^2 \pi x} = \frac{1}{\sqrt{x}} \sum_{n=-\infty}^{\infty} e^{-n^2 \pi / x}. \tag{28.36}$$

In Riemann’s notation, this was equivalent to

$$2\psi(x) + 1 = x^{-\frac{1}{2}} \left(2\psi\left(\frac{1}{x}\right) + 1\right),$$

where

$$\psi(x) = \sum_{n=1}^{\infty} e^{-n^2 \pi x}.$$

Riemann referred to Jacobi’s Fundamenta Nova for (28.36). We note that Jacobi’s 
proof of (28.36) used elliptic functions, while Cauchy and Poisson employed Fourier 
analysis. See Section 34.10 for Jacobi’s proof and Section 34.11 for Cauchy’s proof. 
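The transformation formula (28.36) converges so rapidly that it can be confirmed to machine precision with a few dozen terms. This is only a numerical illustration, not a proof; the function names and the truncation at 50 terms are arbitrary choices.

```python
import math


def theta_sum(x, terms=50):
    """Sum of exp(-n^2 pi x) over n = -terms, ..., terms, for x > 0."""
    return 1.0 + 2.0 * sum(math.exp(-n * n * math.pi * x) for n in range(1, terms + 1))


def theta_transformed(x, terms=50):
    """Right-hand side of (28.36): x^(-1/2) times the same sum evaluated at 1/x."""
    return theta_sum(1.0 / x, terms) / math.sqrt(x)
```

For small $x$ the left-hand sum needs many terms while the right-hand sum is dominated by its $n = 0$ term, which is the practical reason the identity is useful for computing theta functions.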
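Next Riemann rewrote (28.35) as

$$\begin{aligned}
\Gamma\left(\frac{s}{2}\right) \pi^{-\frac{s}{2}}\, \zeta(s) &= \int_1^{\infty} \psi(x)\, x^{\frac{s}{2} - 1}\, dx + \int_0^1 \psi\left(\frac{1}{x}\right) x^{\frac{s-3}{2}}\, dx + \frac{1}{2} \int_0^1 \left(x^{\frac{s-3}{2}} - x^{\frac{s}{2} - 1}\right) dx \\
&= \frac{1}{s(s-1)} + \int_1^{\infty} \psi(x) \left(x^{\frac{s}{2} - 1} + x^{-\frac{1+s}{2}}\right) dx.
\end{aligned} \tag{28.37}$$

This reproved the functional equation because the right-hand side was invariant
under $s \to 1 - s$. Moreover, $\zeta(s)$ was once again defined for all complex $s \neq 1$. To
emphasize the significance of the line $\operatorname{Re} s = \frac{1}{2}$, Riemann set $s = \frac{1}{2} + it$ and denoted
$\frac{s(s-1)}{2}$ times the left-hand side of (28.37) as $\xi(t)$, so that he had

$$\xi(t) = \frac{1}{2} - \left(t^2 + \frac{1}{4}\right) \int_1^{\infty} \psi(x)\, x^{-\frac{3}{4}} \cos\left(\frac{t}{2} \log x\right) dx, \tag{28.38}$$

$$\xi(t) = 4 \int_1^{\infty} \frac{d}{dx}\left(x^{\frac{3}{2}} \psi'(x)\right) x^{-\frac{1}{4}} \cos\left(\frac{t}{2} \log x\right) dx. \tag{28.39}$$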


28.5 Euler’s Product for >> a 


In his 1859 paper giving the formula for the number of primes less than a given 
number, Riemann remarked that he had taken Euler’s infinite product for the zeta 
function as the starting point for his investigations. Indeed, it was Euler’s product 
representation for the zeta function that made it possible to perceive the connection 
between the zeta function and prime numbers. 

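In a 1737 paper, Euler showed how to convert the series for the zeta function,
$\sum_{n=1}^{\infty} \frac{1}{n^s}$, into a product.⁹ Euler’s insightful argument, amounting to an application
of the fundamental theorem of arithmetic, is here presented in its original form.
Euler let

$$x = 1 + \frac{1}{2^n} + \frac{1}{3^n} + \frac{1}{4^n} + \frac{1}{5^n} + \frac{1}{6^n} + \text{etc.}$$

Then

$$\frac{1}{2^n}\, x = \frac{1}{2^n} + \frac{1}{4^n} + \frac{1}{6^n} + \frac{1}{8^n} + \text{etc.}$$

Removing all even numbers by subtraction, he got

$$\frac{2^n - 1}{2^n}\, x = 1 + \frac{1}{3^n} + \frac{1}{5^n} + \frac{1}{7^n} + \text{etc.}$$

Multiplying by $\frac{1}{3^n}$, he obtained

$$\frac{2^n - 1}{2^n} \cdot \frac{1}{3^n}\, x = \frac{1}{3^n} + \frac{1}{9^n} + \frac{1}{15^n} + \frac{1}{21^n} + \text{etc.}$$

Again, Euler removed by subtraction all multiples of 3 so that

$$\frac{2^n - 1}{2^n} \cdot \frac{3^n - 1}{3^n}\, x = 1 + \frac{1}{5^n} + \frac{1}{7^n} + \frac{1}{11^n} + \text{etc.}$$

By continuing this process with each of the prime numbers, all numbers on the
right-hand side except one were eliminated, yielding

$$\frac{(2^n - 1)(3^n - 1)(5^n - 1)\ \text{etc.}}{2^n \cdot 3^n \cdot 5^n\ \text{etc.}}\, x = 1$$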


9 Eu. 1-14 pp. 217-244. E 72 Theorem 8.

or

$$1 + \frac{1}{2^n} + \frac{1}{3^n} + \frac{1}{4^n} + \text{etc.} = \frac{2^n}{2^n - 1} \cdot \frac{3^n}{3^n - 1} \cdot \frac{5^n}{5^n - 1} \cdot \text{etc.}$$

Note that this was in essence the fundamental theorem of arithmetic in analytic
form. Euler’s 1748 book Introductio in Analysin Infinitorum made this connection
more clear, as he expressed this infinite product in almost modern form:¹⁰

$$\frac{1}{\left(1 - \frac{1}{2^n}\right) \left(1 - \frac{1}{3^n}\right) \left(1 - \frac{1}{5^n}\right)\ \text{etc.}}$$

To see the unique factorization theorem here, simply expand these fractions using the
geometric series.

In 1837, Dirichlet defined L-functions for which he found an analogous infinite
product. For example, in the case of characters modulo a prime $p$, he stated the
result as¹¹

$$\prod \frac{1}{1 - w^{\gamma_q} \frac{1}{q^s}} = \sum w^{\gamma_n} \frac{1}{n^s}.$$

The product was defined over all primes $q$ other than $p$, while $w$ was a $(p - 1)$th root
of unity. Dirichlet used this formula in his proof of his famous theorem on primes in
arithmetic progressions. Note also that the product formula shows that the series on
the left-hand side of (28.22) has to be positive, justifying Dirichlet’s remark on it.


28.6 Dirichlet Characters 


Dirichlet’s construction of characters was based on a theorem first observed by Euler
and later completely proved by Gauss in his 1801 Disquisitiones Arithmeticae. Gauss
showed that for any prime $p$, the multiplicative group modulo $p$, whose elements
could be represented by the integers $1, 2, \ldots, p - 1$, was a cyclic group.¹² This means
that there is at least one $g$ among these $p - 1$ integers such that for any $n$, not a multiple
of $p$, there exists an integer $\gamma_n$ such that

$$g^{\gamma_n} \equiv n \pmod{p}. \tag{28.40}$$

This equation implies that for positive integers $m$ and $n$ not multiples of $p$,

$$g^{\gamma_{mn}} \equiv mn \equiv g^{\gamma_m} g^{\gamma_n} = g^{\gamma_m + \gamma_n} \pmod{p},$$

and hence

$$\gamma_{mn} \equiv \gamma_m + \gamma_n \pmod{p - 1}. \tag{28.41}$$


10 Euler (1988) p. 244.
11 Dirichlet (1969) vol. 1, p. 317.
12 Gauss (1965) pp. 35-36.

So if $w$ is a $(p - 1)$th root of unity, we have

$$w^{\gamma_{mn}} = w^{\gamma_m + \gamma_n} = w^{\gamma_m} w^{\gamma_n}. \tag{28.42}$$

The complex number $w$ can be written as $e^{2\pi i k/(p-1)}$ for $k = 1, 2, \ldots, p - 1$. For
any one of these $p - 1$ complex numbers $w$, Dirichlet defined a character with values
$w^{\gamma_1}, w^{\gamma_2}, \ldots, w^{\gamma_{p-1}}$ and with the property

$$\frac{w^{\gamma_n}}{n^s} \cdot \frac{w^{\gamma_m}}{m^s} = \frac{w^{\gamma_{mn}}}{(mn)^s}. \tag{28.43}$$

He observed¹³ that by (28.42)

$$\frac{1}{1 - w^{\gamma_q} \frac{1}{q^s}} = 1 + w^{\gamma_q} \frac{1}{q^s} + w^{\gamma_{q^2}} \frac{1}{q^{2s}} + \cdots$$

for $s > 1$. Then, by the unique factorization theorem,

$$\prod \frac{1}{1 - w^{\gamma_q} \frac{1}{q^s}} = \sum w^{\gamma_n} \frac{1}{n^s}. \tag{28.44}$$

The infinite product was defined over all primes $q$ not equal to $p$, and the sum was
taken over all positive integers not divisible by $p$. Note that this sum can be taken over
all positive integers with the convention that $w^{\gamma_n} = 0$ when $n$ is a multiple of $p$. When
$w = -1$, we have $w^{\gamma_n} = \pm 1$, depending on whether $\gamma_n$ is even or odd. If it is even, we
can write the left-hand side of (28.40) as a square and hence $n$ is a square modulo $p$,
or rather, $n$ is a quadratic residue. We can therefore write

$$(-1)^{\gamma_n} = \left(\frac{n}{p}\right), \tag{28.45}$$

where $\left(\frac{n}{p}\right)$ is the Legendre symbol; it is $+1$ when $n$ is a quadratic residue (mod $p$)
and $-1$ when $n$ is a quadratic nonresidue. For this character, we can write (28.44) as

$$\prod \frac{1}{1 - \left(\frac{q}{p}\right) \frac{1}{q^s}} = \sum \left(\frac{n}{p}\right) \frac{1}{n^s},$$

where $\left(\frac{n}{p}\right) = 0$ when $n$ is a multiple of $p$ and the product is taken over all primes not
equal to $p$.

In his 1837 paper on primes within any arithmetic progression, Dirichlet also
defined characters modulo any positive integer $m$. For this purpose, he employed
a result from Gauss’s Disquisitiones: For any odd prime $p$ and positive integer $k$,
the multiplicative group modulo $p^k$, that is, the integers relatively prime to $p$ and

13 Dirichlet (1969) vol. 1, pp. 316-317.

represented by integers less than $p^k$, is a cyclic group.¹⁴ This theorem enabled
Dirichlet to define $\varphi(p^k) = p^k - p^{k-1}$ different characters corresponding to the
$\varphi(p^k)$ values $e^{2\pi i m/\varphi(p^k)}$, $m = 1, 2, \ldots, \varphi(p^k)$. Letting $w$ denote any one of these
values, the value of the corresponding character at $n$, where $n$ was not divisible by $p$,
would be $w^{\gamma_n}$. As before, $\gamma_n$ was defined as in (28.40), with respect to a generator $g$
of the multiplicative group modulo $p^k$.¹⁵

For powers of 2, the situation was slightly more complex. Clearly, the multiplicative
groups mod 2 and mod 4 are cyclic. Another result from the Disquisitiones stated¹⁶
that every relatively prime residue class mod $2^k$, where $k \ge 3$, could be represented
uniquely as $(-1)^{\gamma} 5^{\gamma'}$, where $\gamma$ was defined to the modulus 2 and $\gamma'$ to the modulus
$\frac{1}{2}\varphi(2^k) = 2^{k-2}$. Again, Dirichlet used Gauss’s result to define the characters modulo
powers of 2 by

$$w^{\gamma} (w')^{\gamma'}, \quad \text{where } w^2 = 1 \text{ and } (w')^{2^{k-2}} = 1;$$

Dirichlet noted that the number of such characters was

$$2^{k-1} = \varphi(2^k).$$
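Gauss's representation of the odd residues modulo $2^k$ can be confirmed by brute force for small $k$. The sketch below simply enumerates all pairs $(\gamma, \gamma')$ with $\gamma \in \{0, 1\}$ and $0 \le \gamma' < 2^{k-2}$; the function name is illustrative.

```python
def sign_five_representation(k):
    """Map each residue n mod 2^k of the form (-1)^g * 5^h to the list of pairs (g, h).

    Here g ranges over {0, 1} and h over {0, ..., 2^(k-2) - 1}; k >= 3 is assumed.
    """
    modulus = 2**k
    table = {}
    for gamma in range(2):
        for gamma_prime in range(2 ** (k - 2)):
            n = ((-1) ** gamma * pow(5, gamma_prime, modulus)) % modulus
            table.setdefault(n, []).append((gamma, gamma_prime))
    return table
```

If the representation is unique, the table has exactly $2^{k-1}$ keys, namely the odd residues, and every list has length one. This matches the count $2^{k-1} = \varphi(2^k)$ of characters.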
Next, Dirichlet defined characters modulo

$$m = 2^k p_1^{k_1} p_2^{k_2} \cdots p_j^{k_j}.$$

He considered an integer $n$ relatively prime to $m$ and assumed

$$n \equiv (-1)^{\gamma} 5^{\gamma'} \pmod{2^k} \quad \text{and} \quad n \equiv g_i^{\gamma_i} \pmod{p_i^{k_i}},$$

where $g_i$ was the generator of the relatively prime residue classes modulo $p_i^{k_i}$. Then
he gave the value of an arbitrary character at $n$ modulo $m$ as

$$w^{\gamma} (w')^{\gamma'} w_1^{\gamma_1} w_2^{\gamma_2} \cdots w_j^{\gamma_j}. \tag{28.46}$$

Here $w_i$ was a root of $w_i^{(p_i - 1) p_i^{k_i - 1}} - 1 = 0$ and there were

$$\varphi(m) = m \prod_{p \mid m} \left(1 - \frac{1}{p}\right)$$

such characters. Dirichlet showed that with this general definition of a character, the
product formula (28.44) would continue to hold. The operative idea behind the product
formula was the multiplicative property of characters.

14 Gauss (1965) pp. 55-59. 
15 Dirichlet (1969) pp. 333-337.
16 Gauss (1965) pp. 59-61.

Primes in Arithmetic Progressions


29.1 Preliminary Remarks 


One of the great theorems of number theory states that any arithmetic progression
$l, l + k, l + 2k, \ldots$, where $l$ and $k$ are relatively prime, contains an infinite number
of primes. Euler conjectured this result for the particular case $l = 1$, probably in
the 1750s, though it appeared in print much later.¹ Apparently, the general form of
this conjecture first appeared in Legendre’s 1798 book on number theory and then
again in later editions. Legendre thought he had proved the theorem,² but in 1801,
Gauss remarked that the proof “does not yet seem to satisfy geometric rigor.”³ Again,
in a paper of 1837,⁴ Dirichlet pointed out that Legendre’s proof was based on a
lemma whose proof was inadequate, but Dirichlet verified the theorem for $k$ prime.
He published a demonstration of the general result two years later.⁵ Interestingly, the
germ of the central idea in Dirichlet’s proof came from Euler. Note that in a paper of
1737,⁶ Euler used the formula

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \frac{1}{\left(1 - \frac{1}{2}\right) \left(1 - \frac{1}{3}\right) \left(1 - \frac{1}{5}\right) \cdots} \tag{29.1}$$

to prove that the series of the reciprocals of primes $\sum \frac{1}{p}$ was divergent. Of course, this
implied that the number of primes was infinite. It is obvious that the series and product
in Euler’s formula are divergent but, as discussed in Chapter 16, this defect is easy to
remedy. In the same paper, Euler studied numerous Dirichlet series and their infinite
products, including


1 Eu. I-4 pp. 146-162. E 596.
2 Legendre (1808) pp. 399-406.
3 Gauss (1965) pp. 461-462.
4 Dirichlet (1969) vol. 1, p. 316.
5 Dirichlet (1969) vol. 1, pp. 411-493.
6 Eu. I-14 pp. 217-244, especially pp. 242-244. E 72.

(29.2)


where the primes of the form $4n + 3$ appeared with a negative sign and those of the form $4n + 1$ had a positive sign. This led him to the series

$$\frac13 - \frac15 + \frac17 + \frac1{11} - \frac1{13} - \frac1{17} + \frac1{19} + \cdots. \tag{29.3}$$

In a letter to Goldbach dated October 28, 1752,⁷ Euler wrote that he had found the sum of this series to be approximately 0.334980, implying that the series $\sum \frac1p$ and $\sum \frac1q$, where $p$ and $q$ were primes of the form $4n + 1$ and $4n + 3$, respectively, were both divergent. Euler’s results were published in a posthumous paper of 1785,⁸ also containing his conjecture for the case $l = 1$, and in which he gave the sum of (29.3) as 0.3349812.

In spite of Gauss’s and Dirichlet’s remarks on the error in Legendre’s reasoning, Legendre, not known for taking heed of the criticism of others, included his flawed proof in the 1808 and 1830 editions of his book. In a paper of 1838,⁹ Dirichlet stated that he unsuccessfully tried to prove the troublesome lemma, finding it at least as difficult to prove as the theorem deduced from it. In 1859, Athanase Dupré (1808-1869) published his proof that Legendre’s lemma was false,¹⁰ for which he was awarded half the 1858 Grand Prix from the French Academy of Sciences.

In his 1837 paper presented to the Berlin Academy,¹¹ Dirichlet wrote that he based his ideas on chapter 15 of Euler’s Introductio in Analysin Infinitorum. He expressed the sum of the reciprocals of the primes in the given arithmetic progression as an appropriate linear combination of the logarithms of the $p - 1$ L-series arising from the $p - 1$ characters modulo $p$. He then had to prove the divergence of this expression, based on the divergence of the series corresponding to the trivial character, $\ln L_0(1)$. Then, in order to maintain the singularity of $L_0(1)$, he had to show that the values $L_\chi(1)$ did not vanish. For L-series arising from complex characters, Dirichlet was easily able to do this. However, it was much more difficult to prove that the L-series produced by the real character, defined by the Legendre symbol, did not vanish. To tackle this problem, Dirichlet first reduced the infinite series to a finite sum and considered two cases of primes: those of the form $4m + 3$, and then $4m + 1$. The first case was relatively easy; for the second case, he used a result on Pell’s equation, from the Disquisitiones. The appearance of Pell’s equation may have alerted Dirichlet to the connection between L-functions for real characters and quadratic forms. In fact, this allowed him to prove in 1839 that the class number of the binary quadratic forms


7 Fuss (1968) pp. 586-591, especially p. 587.
8 E 596.
9 Dirichlet (1969) vol. 1, pp. 357-374, especially p. 357.
10 Dupré (1859).
11 Dirichlet (1969) vol. 1, pp. 309-312, especially p. 310.

of a given determinant could be evaluated in terms of the value of the L-function at 1, implying that the function did not vanish.¹²

With these papers, based on Euler’s work on series, Dirichlet established analytic number theory as a distinct new branch of mathematics; he applied infinite series to the derivation of the class number formula, to the problem of primes in arithmetic progression, and to the evaluation of Gauss sums, leading to a proof of the quadratic reciprocity law. Interestingly, Gauss wrote Dirichlet in 1838 that he had worked with similar ideas around 1801, but he regretted not finding the time to develop and publish them.¹³ Indeed, an incomplete manuscript among Gauss’s unpublished papers, now included in the second volume of Gauss’s collected works,¹⁴ gave a partial outline for the theory of the class number formula. According to Mathews:¹⁵


From this [Gauss’s manuscript] it appears that Gauss succeeded in determining the number of 
classes belonging to a determinant both for definite and indefinite forms; and with regard to 
definite forms it is possible to make out the method that was actually adopted. 


We parenthetically note that the ancient Babylonians considered particular cases of Pell’s equation $x^2 - ny^2 = 1$, where $n$ is a nonsquare positive integer; in India, Brahmagupta in the 600s and Bhaskara in the 1100s gave procedures for solving it.¹⁶ William Brouncker can be credited with giving a general method for its solution in a 1657 letter¹⁷ to Wallis, in response to a challenge from Fermat; Lagrange finally gave a rigorous derivation in 1768.¹⁸

Dirichlet’s proof of the nonvanishing of the L-series was somewhat roundabout, but a more direct proof was published by the Belgian mathematician Charles de la Vallée-Poussin (1866-1962) in his 1896 paper “Démonstration simplifiée du théorème de Dirichlet sur la progression arithmétique.”¹⁹ Vallée-Poussin took the L-functions to be functions of a complex variable and then employed analytic function theory to give his elegant proof. Interestingly, he made use of a construction also given by Dirichlet. Vallée-Poussin, who made many contributions to various areas of analysis, studied at the university at Louvain under L. P. Gilbert, whom he succeeded as professor of mathematics at the age of 26.

As early as 1861-62, Hermann Kinkelin of Basel studied L-functions of complex variables,²⁰ proving their functional relation for characters modulo a prime power. And in 1889 Rudolf Lipschitz, using the Hurwitz zeta function, proved the latter result for general Dirichlet characters.²¹ Between 1895 and 1899,²² Franz Mertens gave proofs of the nonvanishing of the L-series by elementary methods, that is, without the

12 Dirichlet (1969) vol. 1, pp. 499-502.
13 For an English translation of this letter, see Scharlau and Opolka (1984) pp. 178-179.
14 Gauss (1863-1927) vol. 2, p. 269.
15 Mathews (1961) p. 230.
16 Datta and Singh (1962) pp. 146-172.
17 Wallis (1693-1699) vol. 2, p. 797.
18 Lagrange (1867-1892) vol. 2, pp. 494-496.

use of quadratic forms or functions of a complex variable. One of these proofs used a technique from an 1849 paper of Dirichlet on the average behavior of the divisor function. This result now has many elementary proofs, but the simplest may be due to Paul Monsky in 1994,²³ based on the earlier elementary proof of A. Gelfond and Yuri Linnik, published in Russian in 1962.²⁴


29.2 Euler: Sum of Prime Reciprocals 


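In his 1737 paper “Variae Observationes circa Series Infinitas,”²⁵ Euler showed that the sum of the reciprocals of primes

$$\frac12 + \frac13 + \frac15 + \frac17 + \frac1{11} + \frac1{13} + \text{etc.}$$

was of infinite magnitude and was, moreover, the logarithm of the harmonic series

$$1 + \frac12 + \frac13 + \frac14 + \frac15 + \text{etc.}$$

Taking the logarithm of (29.1), Euler got

$$\ln\left(1 + \frac12 + \frac13 + \frac14 + \cdots\right) = -\ln\left(1 - \frac12\right) - \ln\left(1 - \frac13\right) - \ln\left(1 - \frac15\right) - \cdots$$

$$= \left(\frac12 + \frac13 + \frac15 + \cdots\right) + \frac12\left(\frac1{2^2} + \frac1{3^2} + \frac1{5^2} + \cdots\right) + \frac13\left(\frac1{2^3} + \frac1{3^3} + \frac1{5^3} + \cdots\right) + \cdots.$$

Writing $A, B, C, D$, etc. for the sums of the reciprocals of the primes, of their squares, of their cubes, of their fourth powers, and so on, he could express this relation as

$$e^{A + \frac12 B + \frac13 C + \frac14 D + \text{etc.}} = 1 + \frac12 + \frac13 + \frac14 + \frac15 + \text{etc.}$$

He then observed that since the harmonic series diverged to $\infty$ and the series $B, C, D$, etc. were finite, the series

$$\frac12 B + \frac13 C + \frac14 D + \text{etc.}$$

was negligible and hence

$$e^{A} = 1 + \frac12 + \frac13 + \frac14 + \frac15 + \text{etc.}$$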


23 Monsky (1994).
24 English translation: Gelfond and Linnik (1966).
25 Eu. I-14 pp. 217-244, especially pp. 242-244. E 72.

By taking the logarithm of both sides, he obtained his result:

$$\frac12 + \frac13 + \frac15 + \frac17 + \frac1{11} + \frac1{13} + \frac1{17} + \text{etc.} = \ln\left(1 + \frac12 + \frac13 + \frac14 + \frac15 + \text{etc.}\right).$$

Euler then noted that the harmonic series summed to $\ln \infty$ and hence the sum of the reciprocals of the primes was $\ln \ln \infty$. To understand this, recall that $\sum_{k=1}^{n} \frac1k \sim \ln n$; see equation (20.4).
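Euler's statement that the prime reciprocals sum to $\ln \ln \infty$ can be read as saying that the partial sums over primes up to $x$ grow like $\ln \ln x$. The sketch below is only an illustration of that growth, with hypothetical function names; it tabulates the partial sum, $\ln \ln x$, and their difference, which appears to settle near a constant (about 0.2615, the value usually called the Meissel-Mertens constant).

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def prime_reciprocal_sum(limit):
    """Partial sum of 1/p over primes p <= limit."""
    return sum(1.0 / p for p in primes_up_to(limit))


for x in (10**2, 10**3, 10**4, 10**5):
    s = prime_reciprocal_sum(x)
    loglog = math.log(math.log(x))
    print(x, round(s, 4), round(loglog, 4), round(s - loglog, 4))
```

The very slow growth of both columns shows why the divergence of the series is invisible in any direct computation.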


29.3 Dirichlet: Infinitude of Primes in an Arithmetic Progression 


As discussed in our Section 13.3, in 1758-1759, Waring and Simpson, starting with $f(x) = \sum_{n=0}^{\infty} a_n x^n$, used roots of unity to obtain an expression for $\sum_{n=0}^{\infty} a_{mn+l} x^{mn+l}$. In other words, they used characters of the additive group $\mathbb{Z}_m$ to extract from the power series the subsequence of terms in an arithmetic progression. Since the L-functions were multiplicative, Dirichlet had to define and use characters of the multiplicative group. An additional complication for Dirichlet was that he had to work with the logarithm of the L-functions and therefore had to prove their nonvanishing. In the case where $m = p$ was a prime, using some results of Gauss, he found an intricate proof of this fact, published in his 1837 paper “Beweis des Satzes, dass jede unbegrenzte arithmetische Progression.”²⁶ Dirichlet supposed $p$ to be a prime and set

$$\Omega^k = e^{\frac{2\pi i k}{p-1}}, \quad k = 0, 1, \ldots, p-2.$$

He let $L_k$ denote the L-function defined by the product

$$L_k(s) = \prod \left(1 - \frac{\omega^{\gamma_q}}{q^s}\right)^{-1}, \quad \text{where } \omega = \Omega^k = e^{\frac{2\pi i k}{p-1}},$$

and the product was taken over all primes $q \ne p$. Note that $\gamma_q$ is defined by means of a generator $g$ of the multiplicative cyclic group of the integers modulo $p$, that is, $g^{\gamma_q} \equiv q \pmod p$. Then

$$\log L_k = \sum \frac{\Omega^{k\gamma_q}}{q^s} + \frac12 \sum \frac{\Omega^{2k\gamma_q}}{q^{2s}} + \frac13 \sum \frac{\Omega^{3k\gamma_q}}{q^{3s}} + \cdots.$$

To extract the primes in the arithmetic progression congruent to 1 modulo $p$, Dirichlet first observed that for any integer $h$

$$1 + \Omega^{h\gamma} + \Omega^{2h\gamma} + \cdots + \Omega^{(p-2)h\gamma} = \begin{cases} p - 1, & h\gamma \equiv 0 \pmod{p-1}, \\ 0, & h\gamma \not\equiv 0 \pmod{p-1}. \end{cases}$$
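The root-of-unity sum above acts as an indicator of the residue class 1 modulo $p$. The following sketch is an illustration under stated assumptions rather than Dirichlet's own computation: the helper names are hypothetical, the generator is found by brute force, and $\gamma_n$ denotes the index of $n$ with respect to that generator. Averaging $\Omega^{k\gamma_n}$ over $k = 0, \ldots, p-2$ returns 1 when $n \equiv 1 \pmod p$ and 0 otherwise.

```python
import cmath


def find_generator(p):
    """Return a generator of the multiplicative group modulo the odd prime p."""
    for g in range(2, p):
        if len({pow(g, e, p) for e in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no generator found; p should be an odd prime")


def index_table(p):
    """gamma[n] is the exponent e with g**e = n (mod p) for the chosen generator g."""
    g = find_generator(p)
    return {pow(g, e, p): e for e in range(p - 1)}


def indicator_one_mod_p(n, p, gamma):
    """Average of Omega**(k * gamma_n) over k = 0..p-2; n must not be divisible by p."""
    omega = cmath.exp(2j * cmath.pi / (p - 1))
    total = sum(omega ** (k * gamma[n % p]) for k in range(p - 1))
    return total / (p - 1)


p = 7
gamma = index_table(p)
for q in (11, 13, 17, 19, 29, 43):
    value = indicator_one_mod_p(q, p, gamma)
    print(q, q % p, round(value.real, 6))
```

Applied to the logarithms of the $L_k$, the same averaging keeps only the primes $q \equiv 1 \pmod p$ in the leading sum.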


26 Dirichlet (1969) vol. 1, pp. 313-342.

It followed that

$$\log(L_0 L_1 \cdots L_{p-2}) = (p-1)\left(\sum \frac{1}{q^s} + \frac12 \sum \frac{1}{q^{2s}} + \frac13 \sum \frac{1}{q^{3s}} + \cdots\right), \tag{29.4}$$

where the primes $q$ in the first sum satisfied $q \equiv 1 \pmod p$; those in the second sum satisfied $q^2 \equiv 1 \pmod p$; those in the third $q^3 \equiv 1 \pmod p$; and so on. The second and later sums were convergent for $s \ge 1$; to show that $\sum \frac1q$ was divergent, Dirichlet had to focus on the behavior of $L_0, L_1, \ldots, L_{p-2}$ as $s \to 1$. Dirichlet first expressed the series as an integral; he calculated that for any positive real number $k$,

$$S = \frac{1}{k^{1+\rho}} + \frac{1}{(k+1)^{1+\rho}} + \frac{1}{(k+2)^{1+\rho}} + \cdots$$

$$= \frac{1}{\Gamma(1+\rho)} \int_0^1 \frac{x^{k-1}}{1-x} \log^{\rho}\left(\frac1x\right) dx$$

$$= \frac1\rho + \frac{1}{\Gamma(1+\rho)} \int_0^1 \left(\frac{x^{k-1}}{1-x} - \frac{1}{\log\left(\frac1x\right)}\right) \log^{\rho}\left(\frac1x\right) dx.$$

He observed that the integral was convergent as $\rho \to 0^+$. Note that since the series $L_0(s)$ is given by $\sum \frac{1}{m^s}$, where the sum is over all integers $m$ not divisible by $p$, Dirichlet could write

$$L_0(1+\rho) = \sum_{m=1}^{p-1} \sum_{l=0}^{\infty} \frac{1}{(m+lp)^{1+\rho}}.$$

Next, he applied the foregoing integral representation for the series to obtain

$$\sum_{l=0}^{\infty} \frac{1}{(m+lp)^{1+\rho}} = \frac{1}{p^{1+\rho}} \sum_{l=0}^{\infty} \frac{1}{\left(\frac mp + l\right)^{1+\rho}} = \frac1p \cdot \frac1\rho + \phi(\rho),$$

where $\phi(\rho)$ had a finite limit as $\rho \to 0^+$. So Dirichlet could conclude that

$$L_0(1+\rho) = \frac{p-1}{p} \cdot \frac1\rho + \phi(\rho),$$

where $\lim_{\rho \to 0^+} \phi(\rho)$ was finite. This implied that $\log L_0(1+\rho)$ behaved like $-\log \rho$ as $\rho \to 0^+$. Dirichlet also showed that the series $L_1(1), L_2(1), \ldots, L_{p-2}(1)$ were convergent. Thus, if $L_j(1) \ne 0$ for $j = 1, 2, \ldots, p-2$, then the product $L_0 L_1 \cdots L_{p-2}$ had to diverge as $\rho \to 0^+$, and the series $\sum \frac1q$ for $q \equiv 1 \pmod p$ would also diverge. This proved that there existed an infinity of primes of the form $pl + 1$.

Dirichlet found a simple proof that for $j \ne \frac{p-1}{2}$, $L_j(1) \ne 0$. For such a $j$, $\Omega^{p-1-j} \ne \Omega^{j}$; Dirichlet therefore considered the product $L_j L_{p-1-j}$. Recall that

$$L_j(s) = \frac{1}{\Gamma(s)} \int_0^1 \frac{\sum_{m=1}^{p-1} \Omega^{j\gamma_m} x^{m-1}}{1 - x^p} \log^{s-1}\left(\frac1x\right) dx = \psi(s) + \chi(s)\sqrt{-1}.$$

Dirichlet noted that $L_j(s)$ was differentiable for $s > 0$, and hence by the mean value theorem he had

$$\psi(1+\rho) = \psi(1) + \rho\,\psi'(1+\delta\rho), \quad \chi(1+\rho) = \chi(1) + \rho\,\chi'(1+\epsilon\rho),$$

where $0 < \delta < 1$ and $0 < \epsilon < 1$. Since $L_{p-1-j}(s)$ was the complex conjugate of $L_j(s)$, he got

$$L_{p-1-j}(s) L_j(s) = \psi^2(s) + \chi^2(s).$$

Next, if $L_j(1) = 0$, then $L_{p-1-j}(1) = 0$. This implied that $\psi(1) = 0$ and $\chi(1) = 0$. Thus,

$$\log L_j(1+\rho) L_{p-1-j}(1+\rho) = \log \rho^2 \left(\psi'^2(1+\delta\rho) + \chi'^2(1+\epsilon\rho)\right)$$

$$= -2\log\frac1\rho + \log\left(\psi'^2(1+\delta\rho) + \chi'^2(1+\epsilon\rho)\right).$$

These calculations implied that if $L_j(1) = 0$ for $j \ne \frac{p-1}{2}$, then

$$\log L_0 L_j L_{p-1-j} = -\log\frac1\rho + \phi(\rho),$$

and the term on the left-hand side tended to $-\infty$ as $\rho \to 0^+$. Clearly, $\log(L_0 L_1 \cdots L_{p-2})$ was positive from (29.4) so Dirichlet had come to a contradiction. This completed the proof for complex characters.


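Dirichlet then dealt with the difficult case in which $j = \frac{p-1}{2}$. In this case, $\Omega^{\frac{p-1}{2}} = e^{\pi i} = -1$ and hence the character and the series $L_{\frac{p-1}{2}}$ were real and given by

$$L_{\frac{p-1}{2}}(s) = \sum_{n=1}^{\infty} \left(\frac np\right) \frac{1}{n^s}.$$

Recall Dirichlet’s results from Chapter 28: When $p \equiv 3 \pmod 4$,

$$\sum \left(\frac np\right) \frac1n = \frac{\pi}{p\sqrt p}\left(\sum b - \sum a\right),$$

and when $p \equiv 1 \pmod 4$,

$$\sum \left(\frac np\right) \frac1n = \frac{1}{\sqrt p} \log \frac{\prod \sin\left(\frac{b\pi}{p}\right)}{\prod \sin\left(\frac{a\pi}{p}\right)},$$

where $a$ and $b$ were quadratic residues and nonresidues, respectively, modulo $p$. For $p \equiv 3 \pmod 4$, Dirichlet noted that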


$$\sum a + \sum b = \sum_{m=1}^{p-1} m = p\,\frac{p-1}{2} = \text{an odd integer}.$$

Hence for $p \equiv 3 \pmod 4$, $\sum b - \sum a$, having the same parity as $\sum a + \sum b$, could not be zero and $L_{\frac{p-1}{2}}(1) \ne 0$. For $p \equiv 1 \pmod 4$, Dirichlet used a result Gauss proved in section 357 of his Disquisitiones. This important result in cyclotomy stated that

$$2\prod_a \left(x - e^{\frac{2\pi i a}{p}}\right) = Y - Z\sqrt p, \quad 2\prod_b \left(x - e^{\frac{2\pi i b}{p}}\right) = Y + Z\sqrt p, \tag{29.5}$$

where $Y$ and $Z$ were polynomials in $x$ with integral coefficients; hence, Gauss had

$$Y^2 - pZ^2 = 4\prod_{k=1}^{p-1}\left(x - e^{\frac{2\pi i k}{p}}\right) = 4\,\frac{x^p - 1}{x - 1}.$$

Dirichlet set $g = Y(1)$, $h = Z(1)$ so that $g$ and $h$ were integers and $g^2 - ph^2 = 4p$; he could conclude that $g$ was divisible by $p$. He could then set $g = pk$ to obtain $h^2 - pk^2 = -4$. Since $p$ could not divide 4, he could write that $h \ne 0$. Next, when $x = 1$ in (29.5), he got


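$$2\prod_a \left(1 - e^{\frac{2\pi i a}{p}}\right) = 2^{\frac{p+1}{2}} (-1)^{\frac{p-1}{4}} e^{\frac{\pi i}{p}\sum a} \prod_a \frac{e^{\frac{\pi i a}{p}} - e^{-\frac{\pi i a}{p}}}{2i}$$

$$= 2^{\frac{p+1}{2}} (-1)^{\frac{p-1}{4}} e^{\frac{\pi i}{p}\sum a} \prod_a \sin\left(\frac{a\pi}{p}\right) = 2^{\frac{p+1}{2}} \prod_a \sin\left(\frac{a\pi}{p}\right).$$

Note that this last equation depends on the fact that when $p$ is of the form $4n + 1$, $a$ and $p - a$ are both quadratic residues, and the residues can be grouped in pairs. There are $\frac{p-1}{4}$ such pairs, and it follows that

$$\sum_a a = \sum_b b = \frac{p(p-1)}{4}.$$

Similarly,

$$2\prod_b \left(1 - e^{\frac{2\pi i b}{p}}\right) = 2^{\frac{p+1}{2}} \prod_b \sin\left(\frac{b\pi}{p}\right),$$

and thus, because $h \ne 0$,

$$\frac{\prod_b \sin\left(\frac{b\pi}{p}\right)}{\prod_a \sin\left(\frac{a\pi}{p}\right)} = \frac{k\sqrt p + h}{k\sqrt p - h} \ne 1.$$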
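The two closed forms for $\sum \left(\frac np\right)\frac1n$ recalled from Chapter 28 can be compared with direct partial sums of the series. The sketch below is a numerical illustration only; the function names are hypothetical, the Legendre symbol is computed by Euler's criterion, and the partial sums are taken over complete periods of the character so that the truncation error stays small.

```python
import math


def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1, or p - 1


def l_value_partial_sum(p, periods=5000):
    """Partial sum of sum (n/p)/n over `periods` complete periods of length p."""
    chi = [legendre(r, p) for r in range(p)]
    return sum(chi[n % p] / n for n in range(1, periods * p + 1))


def l_value_closed_form(p):
    """Closed form in terms of the residues a and nonresidues b modulo p."""
    residues = [a for a in range(1, p) if legendre(a, p) == 1]
    nonresidues = [b for b in range(1, p) if legendre(b, p) == -1]
    if p % 4 == 3:
        return math.pi / (p * math.sqrt(p)) * (sum(nonresidues) - sum(residues))
    numerator = math.prod(math.sin(math.pi * b / p) for b in nonresidues)
    denominator = math.prod(math.sin(math.pi * a / p) for a in residues)
    return math.log(numerator / denominator) / math.sqrt(p)


for p in (5, 7, 13):
    print(p, round(l_value_partial_sum(p), 5), round(l_value_closed_form(p), 5))
```

In each case the closed form is visibly nonzero, which is the content of the two arguments: a parity argument for $p \equiv 3 \pmod 4$ and the inequality of the two sine products for $p \equiv 1 \pmod 4$.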


This proved that $L_{\frac{p-1}{2}}(s)$ did not vanish at $s = 1$ and also that the number of primes $\equiv 1 \pmod p$ was infinite. To show that the number of primes $\equiv m \pmod p$ was infinite, Dirichlet gave a modified argument. He considered the sum

$$\log L_0 + \Omega^{-\gamma_m} \log L_1 + \Omega^{-2\gamma_m} \log L_2 + \cdots + \Omega^{-(p-2)\gamma_m} \log L_{p-2}$$

$$= (p-1)\left(\sum \frac{1}{q^{1+\rho}} + \frac12 \sum \frac{1}{q^{2+2\rho}} + \frac13 \sum \frac{1}{q^{3+3\rho}} + \cdots\right),$$

where the primes $q$ in the first sum satisfied $q \equiv m \pmod p$ and those in the $k$th sum satisfied $q^k \equiv m \pmod p$. Since he had already proved that $\log L_1, \log L_2, \ldots, \log L_{p-2}$ were finite as $\rho \to 0^+$ and that $\log L_0$ behaved like $\log\left(\frac1\rho\right)$, Dirichlet could conclude that, when the sum was taken over primes $\equiv m \pmod p$, $\sum \frac1q$ diverged.


29.4 Class Number and $L_\chi(1)$


In a paper of 1838 published in Crelle’s Journal, “Sur l’usage des séries infinies dans la théorie des nombres,”²⁷ Dirichlet worked out some particular cases of his class number formula. In this formula, he expressed the class number, a necessarily nonvanishing quantity, in terms of $L_\chi(1)$. In order to give a definition of class number, we first observe that for $a, b, c$ integers, $b^2 - ac$ is called the determinant or discriminant of the quadratic form $ax^2 + 2bxy + cy^2$. Two quadratic forms with the same determinant are in the same class if a linear substitution $x = \alpha x' + \beta y'$ and $y = \gamma x' + \delta y'$ with $\alpha\delta - \beta\gamma = 1$ transforms one quadratic form into the other. This basic definition can be traced to Lagrange. In addition, Lagrange proved that there was a finite number of such classes (called the class number) for a given negative discriminant. Note that Lagrange worked with $b$ instead of $2b$.

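In his 1838 paper, Dirichlet considered quadratic forms of determinant $-q$ with $q$ prime. He separated his proof into the two cases $q = 4\nu + 3$ and $q = 4\nu + 1$; we present Dirichlet’s proof of the former case. He denoted by $f$ the primes for which the discriminant $-q$ was a quadratic residue and by $g$ the primes for which it was not:

$$\left(\frac fq\right) = 1, \quad \left(\frac gq\right) = -1.$$

He then considered the L-series relations

$$\prod \frac{1}{1 - \frac{1}{f^s}} \prod \frac{1}{1 - \frac{1}{g^s}} = \sum \frac{1}{n^s},$$

$$\prod \frac{1}{1 - \frac{1}{f^s}} \prod \frac{1}{1 + \frac{1}{g^s}} = \sum \left(\frac nq\right) \frac{1}{n^s},$$

$$\prod \frac{1}{1 - \frac{1}{f^{2s}}} \prod \frac{1}{1 - \frac{1}{g^{2s}}} = \sum \frac{1}{n^{2s}},$$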


27 Dirichlet (1969) vol. 1, pp. 357-374. 




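where the $n$ were odd numbers not divisible by $q$. He deduced that

$$\frac{\sum \frac{1}{n^s} \cdot \sum \left(\frac nq\right) \frac{1}{n^s}}{\sum \frac{1}{n^{2s}}} = \prod \frac{1 + \frac{1}{f^s}}{1 - \frac{1}{f^s}} = \sum \frac{2^\mu}{m^s}, \tag{29.6}$$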


where the summation was over odd integers $m$ divisible only by primes of the type $f$; $\mu$ was the number of distinct primes $f$ by which $m$ was divisible. Dirichlet denoted the inequivalent quadratic forms of determinant $-q$ by

$$ax^2 + 2bxy + cy^2, \quad a'x^2 + 2b'xy + c'y^2, \quad \ldots$$

and then observed that articles 180, 155, 156, and 105 of Gauss’s Disquisitiones implied that

$$2\sum \frac{2^\mu}{m^s} = \sum \frac{1}{(ax^2 + 2bxy + cy^2)^s} + \sum \frac{1}{(a'x^2 + 2b'xy + c'y^2)^s} + \cdots, \tag{29.7}$$

where the summations on the right-hand side were taken over positive as well as negative values of $x$ and $y$ relatively prime to one another. From this Dirichlet deduced that

$$2\sum \frac{1}{n^s} \cdot \sum \left(\frac nq\right) \frac{1}{n^s} = \sum \frac{1}{(ax^2 + 2bxy + cy^2)^s} + \sum \frac{1}{(a'x^2 + 2b'xy + c'y^2)^s} + \cdots. \tag{29.8}$$


Without giving details, Dirichlet remarked in this paper that “by means of geometric considerations” it could be proved that

$$\sum \frac{1}{(ax^2 + 2bxy + cy^2)^{1+\rho}} \sim \frac{q-1}{2q\sqrt q} \cdot \frac{\pi}{\rho} \quad \text{as } \rho \to 0^+. \tag{29.9}$$

Thus, given $h$ different inequivalent forms of determinant $-q$, that is, if $h$ were the class number, then the right-hand side of (29.8) could be expressed as

$$\frac{h(q-1)}{2q\sqrt q} \cdot \frac{\pi}{\rho}. \tag{29.10}$$

On the other hand, since the $n$ were the odd numbers not divisible by $q$, he had

$$\sum \frac{1}{n^{1+\rho}} \sim \frac{q-1}{2q} \cdot \frac{1}{\rho} \quad \text{as } \rho \to 0^+. \tag{29.11}$$

He applied (29.10) and (29.11) to (29.8) and found a special case of his famous class number formula

$$h = \frac{2\sqrt q}{\pi} \sum \left(\frac nq\right) \frac1n, \tag{29.12}$$


a formula that expressed the class number in terms of the value of an L-series at $s = 1$. Since the class number had to be at least one, the series had a nonzero value.
In 1839, Dirichlet published a proof along similar lines of this result for arbitrary 
negative determinants. Recall that Dirichlet made liberal use of results from Gauss in 
his proofs; his contemporaries reported that his copy of the Disquisitiones was never 
kept on the shelf, but on his writing table, and that it always accompanied him on 
his travels. Through his lectures on number theory, published by Dedekind, Dirichlet 
made the work of Gauss accessible to all his students. 
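The class number formula for a prime $q \equiv 3 \pmod 4$ can be tested numerically in small cases. The sketch below rests on assumptions that go beyond the text: it takes $h$ to be the number of reduced, properly primitive positive forms $ax^2 + 2bxy + cy^2$ with $b^2 - ac = -q$ (reduced meaning $|2b| \le a \le c$), and it reads the series as $\frac{2\sqrt q}{\pi}\sum \left(\frac nq\right)\frac1n$ over odd $n$. All function names are hypothetical.

```python
import math


def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r


def proper_reduced_forms(q):
    """Reduced, properly primitive forms (a, b, c) with b^2 - a c = -q."""
    forms = []
    b = 0
    while 3 * b * b <= q:              # 2|b| <= a <= c forces 3 b^2 <= q
        n = q + b * b                  # n = a * c
        for a in range(max(1, 2 * b), math.isqrt(n) + 1):
            if n % a:
                continue
            c = n // a
            if math.gcd(a, math.gcd(2 * b, c)) != 1:
                continue               # keep only properly primitive forms
            forms.append((a, b, c))
            # (a, -b, c) is a separate class unless b = 0, 2b = a, or a = c.
            if b > 0 and 2 * b != a and a != c:
                forms.append((a, -b, c))
        b += 1
    return forms


def class_number_from_series(q, periods=5000):
    """Partial sum of (2 sqrt(q) / pi) * sum over odd n of (n/q)/n."""
    total = sum(legendre(n, q) / n for n in range(1, 2 * q * periods, 2))
    return 2 * math.sqrt(q) / math.pi * total


for q in (3, 7, 11, 19, 23):
    print(q, len(proper_reduced_forms(q)), round(class_number_from_series(q), 3))
```

For these small primes the count of forms and the value of the series agree to within the truncation error, which illustrates why the series cannot vanish: it equals a positive integer.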


29.5 Vallée-Poussin’s Complex Analytic Proof of $L_\chi(1) \ne 0$


Before he published his famous work on the distribution of primes, Vallée-Poussin published a paper presenting a simpler proof of Dirichlet’s theorem on primes in arithmetic progressions, observing that his proof was more natural since it did not depend on the theory of quadratic forms. In this paper, presented to the Belgian Academy in 1896,²⁸ Vallée-Poussin defined $L_\chi(s)$, where $\chi$ was a character modulo an integer $M$, as a function of a complex variable $s$. By a simple argument, he showed that for the principal character $\chi_0$, $L_{\chi_0}(s)$ was an analytic function for $\operatorname{Re} s > 0$, except for a simple pole at $s = 1$. On the other hand, for any nonprincipal character, the corresponding L-function was analytic for $\operatorname{Re} s > 0$ with no exception. Vallée-Poussin’s proof that $L_\chi(1) \ne 0$ employed a function similar to one constructed by Dirichlet in his discussion of quadratic forms of negative discriminant. He let $\chi$ be a real nonprincipal character; he let $q_1$ denote primes for which $\chi(q_1) = 1$ and let $q_2$ denote primes for which $\chi(q_2) = -1$. He set

$$\psi(s) = \frac{L_\chi(s) L_{\chi_0}(s)}{L_{\chi_0}(2s)}.$$

Then, for $\operatorname{Re} s > 1$, he observed that

$$\psi(s) = \prod_{q_1} \frac{1 + q_1^{-s}}{1 - q_1^{-s}} = \prod_{q_1} \left(1 + 2q_1^{-s} + 2q_1^{-2s} + \cdots\right) = \sum_{n=1}^{\infty} \frac{a_n}{n^s},$$

where $a_n \ge 0$. In addition, since $L_{\chi_0}(2s)$ had a pole at $s = \frac12$, he deduced that $\psi(s) = 0$ at $s = \frac12$. Vallée-Poussin also observed that there was at least one prime $q_1$. If not, then $\psi(s) = 1$ for $\operatorname{Re} s > 1$ and by analytic continuation $\psi\left(\frac12\right) = 1$. This contradicted $\psi\left(\frac12\right) = 0$, and hence $a_n > 0$ for some $n > 1$ in $\sum \frac{a_n}{n^s}$.


28 Vallée-Poussin (1896a).

In order to obtain a proof by contradiction, Vallée-Poussin next assumed that $L_\chi(s) = 0$ at $s = 1$, so that this zero would cancel the pole of $L_{\chi_0}(s)$ at $s = 1$ and $L_\chi(s) L_{\chi_0}(s)$ would be analytic for $\operatorname{Re} s > 0$. Again for $\operatorname{Re} s > 1$, the derivatives of $\psi(s)$ were given by

$$\psi^{(m)}(s) = (-1)^m \sum_{n=1}^{\infty} \frac{a_n (\log n)^m}{n^s}, \quad m = 1, 2, 3, \ldots.$$

He let $a > 0$ so that $\psi(1 + a + t)$ had radius of convergence greater than $a + \frac12$ and

$$\psi(1 + a + t) = \psi(1 + a) + t\,\psi'(1 + a) + \frac{t^2}{2!}\,\psi''(1 + a) + \cdots.$$

Denoting $(-1)^m \psi^{(m)}(1 + a)$ by $A_m$, it was clear that $A_m > 0$ and for $t = -\left(a + \frac12\right)$

$$\psi\left(\frac12\right) = \psi(1 + a) + \left(a + \frac12\right) A_1 + \frac{\left(a + \frac12\right)^2}{2!} A_2 + \cdots.$$

Since $\psi\left(\frac12\right) = 0$ and all the terms on the right were positive, Vallée-Poussin arrived at the necessary contradiction.


29.6 Gelfond and Linnik: Proof of $L_\chi(1) \ne 0$


Gelfond and Linnik’s proof that $L_\chi(1) \ne 0$ for any real nonprincipal character $\chi$ modulo $m$ was presented in their 1962 book.²⁹ They made the observation that if $\phi$ denoted the Euler totient function and $T(n) = \sum_{k=1}^{n} \chi(k)$, then for any positive integer $N$

$$|T(N) - T(n-1)| \le \phi(m) \quad \text{and} \quad \left|\sum_{k=n}^{N} \frac{\chi(k)}{k}\right| \le \frac{\phi(m)}{n}. \tag{29.13}$$

Note that the second inequality follows by partial summation. Gelfond and Linnik defined the function

$$U(x) = \sum_{n=1}^{\infty} \frac{\chi(n) x^n}{1 - x^n} = \sum_{n=1}^{\infty} \left(\sum_{d \mid n} \chi(d)\right) x^n \tag{29.14}$$

and showed that

$$U(x) > \frac{1}{2\sqrt{1 - x}}. \tag{29.15}$$


29 Gelfond and Linnik (1966).

They then proved that if $L_\chi(1) = 0$, then

$$U(x) = O\left(\ln \frac{1}{1 - x}\right). \tag{29.16}$$

Clearly, (29.15) and (29.16) were in contradiction to one another, indicating that the assumption $L_\chi(1) = 0$ had to be false. Next, to prove (29.15), Gelfond and Linnik made the important observation that

$$f_n = \sum_{d \mid n} \chi(d) = \prod_{k=1}^{s} \left(1 + \chi(p_k) + \cdots + \chi(p_k^{\nu_k})\right), \quad \left(n = p_1^{\nu_1} p_2^{\nu_2} \cdots p_s^{\nu_s}\right).$$

Since $\chi(p) = 1$, $-1$, or $0$, it followed that $f_n \ge 0$. Then, if $n$ were a square, all the $\nu_k$ would be even and each of the $s$ factors of $f_n$ would be $\ge 1$. Thus, $f_n \ge 1$ when $n$ was a square. Hence, with $1 > x > \frac12$, $x > x_0$,

$$U(x) \ge \sum_{n=1}^{\infty} x^{n^2} = \int_1^{\infty} x^{t^2}\, dt + O(1)$$

$$= \frac{1}{\sqrt{\ln \frac1x}} \int_{\sqrt{\ln \frac1x}}^{\infty} e^{-t^2}\, dt + O(1) = \frac{\sqrt\pi}{2\sqrt{\ln \frac1x}} + O(1)$$

$$= \frac{\sqrt\pi}{2\left(-\ln(1 - (1 - x))\right)^{1/2}} + O(1) > \frac{1}{2\sqrt{1 - x}}.$$

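We note that Gelfond and Linnik set $x > x_0$ for some $x_0$ such that the inequalities would hold. To prove (29.16), they set

$$S_n = \sum_{k=n}^{\infty} \frac{\chi(k)}{k},$$

$$R(x) = U(x) - \frac{L_\chi(1)}{1 - x}$$

$$= \sum_{n=1}^{\infty} \chi(n)\left(\frac{x^n}{1 - x^n} - \frac{x^n}{n(1 - x)}\right) + \frac{1}{1 - x}\left(\sum_{n=1}^{\infty} \frac{\chi(n) x^n}{n} - L_\chi(1)\right)$$

$$= \sum_{n=1}^{\infty} \chi(n)\left(\frac{x^n}{1 - x^n} - \frac{x^n}{n(1 - x)}\right) - \sum_{n=0}^{\infty} S_{n+1} x^n. \tag{29.17}$$

To see how they arrived at the last equation, observe that

$$\frac{1}{1 - x}\left(\sum_{n=1}^{\infty} (S_n - S_{n+1}) x^n - L_\chi(1)\right) = \frac{1}{1 - x}\left(S_1 x - \sum_{n=1}^{\infty} S_{n+1} x^n (1 - x) - S_1\right) = -\sum_{n=0}^{\infty} S_{n+1} x^n.$$

Next, by (29.13), $S_n = O\left(\frac1n\right)$, and hence they could write the second sum in (29.17) as

$$\sum_{n=0}^{\infty} S_{n+1} x^n = O\left(\sum_{n=1}^{\infty} \frac{x^n}{n}\right) = O\left(\ln \frac{1}{1 - x}\right).$$

By an application of Abel’s summation by parts to the first sum in (29.17), they got

$$\left|\sum_{n=1}^{\infty} \chi(n)\left(\frac{x^n}{1 - x^n} - \frac{x^n}{n(1 - x)}\right)\right|$$

$$= \left|\sum_{n=1}^{\infty} T(n)\left(\frac{x^n}{1 - x^n} - \frac{x^n}{n(1 - x)} - \frac{x^{n+1}}{1 - x^{n+1}} + \frac{x^{n+1}}{(n+1)(1 - x)}\right)\right|$$

$$\le \frac{\phi(m)}{1 - x} \sum_{n=1}^{\infty} \left|\frac{x^n}{1 + x + \cdots + x^{n-1}} - \frac{x^{n+1}}{1 + x + \cdots + x^n} - \frac{x^n}{n(n+1)} - \frac{(1 - x)x^n}{n+1}\right|$$

$$\le \frac{\phi(m)}{1 - x} \sum_{n=1}^{\infty} \left(\frac{x^n}{1 + x + \cdots + x^{n-1}} - \frac{x^{n+1}}{1 + x + \cdots + x^n} - \frac{x^n}{n(n+1)}\right) + \phi(m) \sum_{n=1}^{\infty} \frac{x^n}{n+1}$$

$$= O\left(\ln \frac{1}{1 - x}\right).$$

Note that the final inequality was possible because the expression in the first sum was positive; then, since the series was telescoping, it could be summed explicitly. It then followed from (29.17) that if $L_\chi(1) = 0$, then $U(x) = O\left(\ln \frac{1}{1 - x}\right)$; this completed the proof.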
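The key positivity facts behind the lower bound for $U(x)$ can be checked numerically for a specific character. The sketch below is an illustration under stated assumptions, not part of Gelfond and Linnik's argument: it uses the Legendre symbol modulo 7 as the real nonprincipal character, and the helper names are hypothetical. It verifies that $f_n = \sum_{d \mid n} \chi(d) \ge 0$, that $f_n \ge 1$ on squares, and that the Lambert series and the power series for $U(x)$ agree.

```python
def real_character(p):
    """A real nonprincipal character modulo an odd prime p: the Legendre symbol."""
    def chi(n):
        r = pow(n % p, (p - 1) // 2, p)
        return -1 if r == p - 1 else r
    return chi


def divisor_sum(chi, n):
    """f_n = sum of chi(d) over all divisors d of n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)


def lambert_series(chi, x, terms=400):
    """Truncation of U(x) = sum chi(n) x^n / (1 - x^n)."""
    return sum(chi(n) * x**n / (1 - x**n) for n in range(1, terms + 1))


def power_series(chi, x, terms=400):
    """Truncation of U(x) = sum f_n x^n."""
    return sum(divisor_sum(chi, n) * x**n for n in range(1, terms + 1))


chi = real_character(7)
print([divisor_sum(chi, n) for n in range(1, 17)])
print(lambert_series(chi, 0.9), power_series(chi, 0.9))
```

Because every $f_n$ is nonnegative and the squares contribute at least $x^{k^2}$ each, the power series dominates $\sum x^{k^2}$ term by term, which is the source of the growth of $U(x)$ as $x \to 1^-$.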


29.7 Monsky’s Proof That $L_\chi(1) \ne 0$


In 1994, Paul Monsky showed that Gelfond’s proof of $L_\chi(1) \ne 0$ could be considerably simplified.³⁰ Use of the strong result (29.15) turned out to be avoidable. Observe that $\lim_{x \to 1^-} U(x) = \infty$, because $U(x) \ge \sum_{n=1}^{\infty} x^{n^2}$. Monsky demonstrated that if $L_\chi(1)$ vanished, then $U(x)$ was bounded. This contradiction proved the result. His simplification took place in the first sum, $R_1(x)$, in (29.17), where he used instead the expansion

$$\frac{L_\chi(1)}{1 - x} - U(x) = \sum_{n=1}^{\infty} \chi(n)\left(\frac{1}{n(1 - x)} - \frac{x^n}{1 - x^n}\right) = \sum_{n=1}^{\infty} \chi(n) b_n.$$

30 Monsky (1994).

Monsky first showed that $b_1 \ge b_2 \ge b_3 \ge \cdots$. He noted that

$$(1 - x)(b_n - b_{n+1}) = \frac{1}{n(n+1)} - \frac{x^n}{(1 + x + \cdots + x^{n-1})(1 + x + \cdots + x^n)}.$$

Then, by the inequality of the arithmetic and geometric means

$$1 + x + \cdots + x^{n-1} \ge n x^{\frac{n-1}{2}} \ge n x^{\frac n2}$$

and

$$1 + x + \cdots + x^n \ge (n+1) x^{\frac n2}.$$

Hence, $b_n \ge b_{n+1}$. Applying Abel’s partial summation, Monsky wrote

$$\left|\sum_{n=1}^{\infty} \chi(n) b_n\right| \le \phi(m)\, b_1 = \phi(m)\left(\frac{1}{1 - x} - \frac{x}{1 - x}\right) = \phi(m). \tag{29.18}$$

He next assumed that $L_\chi(1) = 0$, and this implied $U(x) = R(x)$; but (29.18) in turn implied that $\lim_{x \to 1^-} U(x)$ could not be infinite. Thus, he got a contradiction to prove the result.
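The monotonicity on which Monsky's argument turns is easy to examine numerically. The following sketch assumes the normalization $b_n = \frac{1}{n(1-x)} - \frac{x^n}{1-x^n}$, which is an assumption made here for illustration, and the function name is hypothetical. It checks that $b_1 = 1$ and that the sequence is nonincreasing for several values of $x$ in $(0, 1)$.

```python
def monsky_b(n, x):
    """b_n = 1/(n(1-x)) - x^n/(1-x^n) for 0 < x < 1 (assumed normalization)."""
    return 1.0 / (n * (1.0 - x)) - x**n / (1.0 - x**n)


for x in (0.5, 0.9, 0.99):
    values = [monsky_b(n, x) for n in range(1, 200)]
    # b_1 = 1/(1-x) - x/(1-x) = 1, independent of x.
    assert abs(values[0] - 1.0) < 1e-9
    # The sequence is nonincreasing (small tolerance for rounding error).
    assert all(u >= v - 1e-12 for u, v in zip(values, values[1:]))
    print(x, [round(v, 4) for v in values[:5]])
```

With a bounded, decreasing sequence of weights and bounded partial sums of $\chi$, partial summation bounds $\sum \chi(n) b_n$ uniformly in $x$, which is exactly what contradicts the unbounded growth of $U(x)$.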


29.8 Exercises 


(1) Investigate Chebyshev’s assertion in an 1853 letter to Fuss that

$$\lim_{c \to 0} \left(e^{-3c} - e^{-5c} + e^{-7c} + e^{-11c} - e^{-13c} - e^{-17c} + e^{-19c} + e^{-23c} - \cdots\right)$$

diverges to $+\infty$. See Chebyshev (1899-1907) vol. 1, p. 697. See also Hardy (1966-1979) vol. 2, pp. 42-49, where Hardy and Littlewood derive it from the extended Riemann hypothesis for the series $1^{-s} - 3^{-s} + 5^{-s} - 7^{-s} + \cdots$. Note the editor’s comment on this result on p. 98.


(2) Let $\chi$ be a real nonprincipal character modulo $m$, and let $f(n) = \sum \chi(d)$, where the sum is over all divisors $d$ of $n$. Show that if

$$G(x) = \sum_{n \le x} \frac{f(n)}{\sqrt n}, \quad \text{then} \quad \lim_{x \to \infty} G(x) = \infty.$$

Show also that

$$G(x) = 2\sqrt x\, L(1, \chi) + O(1).$$

Conclude that if $L(1, \chi) = 0$, a contradiction ensues. See Mertens (1895).
(3) Suppose $\chi$ is a primitive character mod $d$, that is, there does not exist a proper divisor $m$ of $d$ such that $\chi(a) = \chi(b)$ whenever $a \equiv b \pmod m$ and $ab$ is prime to $d$. Define the Gauss sum $G(\chi)$ by

$$G(\chi) = \sum_{a=1}^{d} \chi(a) e^{\frac{2\pi i a}{d}}.$$

Prove that

$$\overline{\chi}(n) G(\chi) = \sum_{a=1}^{d} \chi(a) e^{\frac{2\pi i n a}{d}}.$$

This result is due to Vallée-Poussin (2000) vol. 1, pp. 358-362; he also defined the concept of a primitive character.


(4) Define the Euler polynomials $E_n(t)$ by the relation


wm 


Let $\chi$ be a primitive character (mod $d$) and let $k$ be a positive integer such that $\chi(-1) = (-1)^k$. Prove that if $q$ is the greatest integer in $\frac{d-1}{2}$, then


(kK-1)! a 1 exes 2a 
Gane SMEG, 0) = 2 — gy 2a HOE (=) 


This formula is due to Shimura (2007) p. 35; in this book, Shimura observed that most books and papers give only one result on the values of the Dirichlet L-function. Shimura derived several new formulas for these values, including the foregoing example. For a discussion of Shimura’s well-known conjecture related to Fermat’s theorem, see Gouvêa (1994) and Shimura (2008).

(5) Prove that if the set of positive integers is partitioned into a disjoint union of two nonempty subsets, then at least one of the subsets must contain arbitrarily long arithmetic progressions. This result was conjectured by I. Schur and proved in 1927 by van der Waerden, who studied under E. Noether. The reader may enjoy reading the proof in Khinchin (1998), a book originally written in 1945 as a letter to a soldier recovering from his wounds. In 1927, van der Waerden’s theorem was a somewhat isolated result, but it has now become a part of Ramsey theory, an important area of combinatorics. See Graham, Rothschild, and Spencer (1990). Robert Ellis’s algebraic methods in topological dynamics also have applications to this topic. See Ellis, Ellis, and Nerurkar (2000).

(6) Prove that the primes contain arbitrarily long arithmetic progressions. For this result of Ben Green and Terence Tao, see Green’s article in Duke and Tschinkel (2007).

29.9 Notes on the Literature 


A historical account of the topic of this chapter was given by Littlewood’s student Davenport (1980). This book was very influential because of its treatment of the large sieve, a relatively new topic at the time of first publication in 1967. The extensive notes in each chapter refer to numerous papers and books on this and related topics.

Distribution of Primes: Early Results


30.1 Preliminary Remarks


Prime numbers appear to be distributed among the integers in a random way. Mathematicians have searched for a pattern or patterns in the sequence of primes, discovering many interesting features and properties of primes and sequences of primes, but many fundamental questions remain outstanding. In the area of prime number distribution, even apparently very elementary results can be enlightening. For example, in 1737, Euler proved that the series $\sum \frac1p$, where $p$ is prime, was divergent.¹ He also knew that $\sum_{n=1}^{\infty} \frac{1}{n^2}$ was convergent. By combining these results, one may see that the prime numbers are more numerous than the square numbers. Thus, for large enough $x$, we expect that $\pi(x)$, the number of primes less than or equal to $x$, satisfies $\pi(x) > \sqrt x$. In fact, extending this type of reasoning, we may expect that

$$\pi(x) > x^{1-\delta} \tag{30.1}$$

for any $\delta > 0$ and $x$ correspondingly large enough. Recall that, in fact, Euler had a fairly definite idea of how the series of prime reciprocals diverged:

$$\sum \frac1p = \ln(\ln \infty). \tag{30.2}$$

From this, it can easily be shown, by means of a nonrigorous, probabilistic argument, that the density of primes in the interval $(1, x)$ is approximately $\frac{1}{\ln x}$. In 1791 or 1792, when he was about 15 years old, Gauss conjectured just this result.

Gauss never published anything on the distribution of primes, but in 1849 he wrote a letter to the astronomer J. F. Encke giving some insight into his thought in this area.² Gauss recounted that he had started making a table of prime numbers from a very young age, noting the number of primes in each chiliad, or interval of a thousand. As

1 Eu. I-14 pp. 217-244. E 72.
2 For an English translation of this letter, see Goldstein (1973).

a consequence of this work, around 1792 he wrote the following remark in the margin of his copy of J. C. Schulze’s mathematical tables:

Primzahlen unter $a$ $(= \infty)$ $\frac{a}{la}$.

We may understand this to mean that

$$\lim_{a \to \infty} \frac{\pi(a)}{a / \ln a} = 1. \tag{30.3}$$

This is the prime number theorem. Let us see how Gauss may have come to this conclusion. Consider the following table:

x          π(x)     π(x)/x
10         4        0.4
100        25       0.25
1000       168      0.168
10000      1229     0.1229
100000     9592     0.09592
1000000    78498    0.078498

Look at the column for $\frac{\pi(x)}{x}$. If we divide 0.4, the number in the first row, by 2, 3, 4, 5, 6, then we get approximately the numbers in the second, third, fourth, fifth, and sixth rows. The result is even nicer if we change the 0.4 to 0.5 and then do the division. So if we write $\frac{\pi(x)}{x}$ as $\frac{1}{f(x)}$, then $f(x)$ has the property that $f(10^n) = n f(10)$ for $n = 2, 3, 4, 5, 6$. This calculation strongly suggests that $f(x)$ is the logarithmic function. Moreover, $\frac{1}{f(10)} = 0.4$ and $\ln 10 \approx 2.3$; this may have led Gauss to his conjecture that $f(x) = \ln x$.
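The table of prime counts at powers of ten, and the comparison with $\frac{1}{\ln x}$ that it suggests, can be regenerated with a simple sieve. The sketch below is a modern illustration with hypothetical function names; it is not a reconstruction of how Gauss computed his counts.

```python
import math


def prime_sieve(limit):
    """Return a bytearray s with s[n] == 1 exactly when n is prime, for 0 <= n <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return sieve


def gauss_table(exponents=range(1, 7)):
    """Rows (x, pi(x), pi(x)/x, 1/ln x) for x = 10**n, n in exponents."""
    limit = 10 ** max(exponents)
    sieve = prime_sieve(limit)
    rows = []
    for n in exponents:
        x = 10 ** n
        count = sum(sieve[: x + 1])
        rows.append((x, count, count / x, 1 / math.log(x)))
    return rows


for x, count, density, prediction in gauss_table():
    print(f"{x:>8} {count:>6} {density:.6f} {prediction:.6f}")
```

The last two columns decrease together, with $\frac{\pi(x)}{x}$ staying somewhat above $\frac{1}{\ln x}$ throughout this range.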

In his letter to Encke, Gauss suggested the approximation $\pi(x) \approx \int_2^x \frac{dt}{\ln t}$. In fact, he gave the following table of values for $\pi(x)$ and the corresponding values of the integral:

x          π(x)      ∫ dt/ln t    error
500000     41556     41606.4      +50.4
1000000    78501     78627.5      +126.5
1500000    114112    114263.1     +151.1
2000000    148883    149054.8     +171.8
2500000    183016    183245.0     +229.0
3000000    216745    216970.6     +225.6


Observe that there are inaccuracies in this table. Gauss made mistakes in his extensive calculations of primes, but the number of his mistakes is surprisingly small. For example, his value of the number of primes less than a million was overestimated by three, while he underestimated those under three million by 72.

In an 1810 letter³ to the astronomer Olbers, F. W. Bessel (1784-1846) mentioned the logarithmic integral, now defined by

$$\operatorname{li}(x) = \lim_{\epsilon \to 0} \left(\int_0^{1-\epsilon} \frac{dt}{\ln t} + \int_{1+\epsilon}^{x} \frac{dt}{\ln t}\right) = \int_2^x \frac{dt}{\ln t} + 1.0451. \tag{30.4}$$

Gauss had already computed this integral for several values of $x$ and Bessel noted in his letter that he learned from Gauss that $\pi(400{,}000) = 33{,}859$, while the corresponding value of the logarithmic integral was 33,922.621995.

In his 1798 book on number theory, Legendre made a similar conjecture: that 
$\frac{x}{A\ln x + B}$ was a good approximation of $\pi(x)$ for suitable $A$ and $B$. In the second edition 
of his book, published in 1808, he gave the values $A = 1$ and $B = -1.08366$.⁴ Gauss 
observed in his letter to Encke that as the value of $x$ was made larger, the value of $B$ 
must likewise increase. However, Gauss was unwilling to conjecture that $B \to -1$ as 
$x \to \infty$. It is interesting to note that while Gauss was writing these thoughts to Encke, 
the Russian mathematician Chebyshev was developing his ideas on prime numbers, 
showing that if $B$ tended to a limit as $x \to \infty$, then the limit had to be $-1$.⁵ 

In 1849, the Russian Academy of Sciences published a collection of Euler’s papers 
on number theory. In 1847, the editor, Viktor Bunyakovski, solicited Chebyshev’s 
participation in this project, thereby arousing his interest in number theory. Thus, 
in 1849 Chebyshev defended his doctoral thesis on the theory of congruences, one of 
whose appendices discussed the number of primes not exceeding a given number. He 
there expressed doubt about the accuracy of Legendre’s formula. He then went on to 
prove that if $n$ was any fixed nonnegative integer and $\rho$ was a positive real variable, 
then the sum 

$$\sum_{x=2}^{\infty}\left(\pi(x+1) - \pi(x) - \frac{1}{\ln x}\right)\frac{\ln^n x}{x^{1+\rho}}, \tag{30.5}$$

considered as a function of $\rho$, approached a finite limit as $\rho \to 0$. We mention that 
Chebyshev wrote $\phi(x)$ for $\pi(x)$. From this theorem, he deduced that $\frac{x}{\pi(x)} - \ln x$ 
could not have a limit other than $-1$ as $x \to \infty$. He then observed that this result 
contradicted Legendre’s formula, under which the limit was given as $-1.08366$. 

Chebyshev wrote a second paper on prime numbers in 1850.⁶ This important 
work was apparently motivated by Joseph Bertrand’s conjecture that for all integers 
$n > 3$, there was at least one prime between $n$ and $2n - 2$. In 1845, Bertrand used this 
conjecture to prove a theorem on symmetric functions.⁷ In group theoretic terms, the 
theorem states that the index of a proper subgroup of the symmetric group $S_n$ is either 
2 or $\ge n$. Chebyshev proved Bertrand’s conjecture using Stirling’s approximation. 


3 See Erman (1852) vol. 1, p. 238. 
4 Legendre (1808) p. 394. 
5 Chebyshev (1899-1907) vol. 1, pp. 27-70. 
6 Chebyshev (1850). 
7 Bertrand (1845). 
He also showed that the series $\sum_p \frac{1}{p\ln p}$ converged. The results he obtained implied 
the double inequality 

$$0.92129 < \frac{\pi(x)}{x/\ln x} < 1.10555. \tag{30.6}$$

In a paper of 1881,⁸ Sylvester used Chebyshev’s analysis to give improved bounds, 
obtaining 0.95695 for the lower bound and 1.04423 for the upper bound. Though 
Schur and others have succeeded in narrowing the gap between the bounds,⁹ it appears 
that Chebyshev’s methods cannot be developed to give a proof of the prime number 
theorem. We note that, in order to prove Bertrand’s conjecture, Chebyshev defined two 
arithmetical functions of interest even today: 

$$\theta(x) = \sum_{p\le x}\ln p, \tag{30.7}$$

$$\psi(x) = \theta(x) + \theta\left(x^{1/2}\right) + \theta\left(x^{1/3}\right) + \theta\left(x^{1/4}\right) + \cdots. \tag{30.8}$$


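In fact, Chebyshev proved the inequalities 

$$Ax - \frac52\ln x - 1 < \psi(x) < \frac65 Ax + \frac{5}{4\ln 6}\ln^2 x + \frac54\ln x + 1, \tag{30.9}$$

where 

$$A = \ln\frac{2^{1/2}\,3^{1/3}\,5^{1/5}}{30^{1/30}} = 0.92129202\ldots.$$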


He then used inequalities (30.9) to indicate a method for obtaining the result for $\pi(x)$, 
though he did not give the results explicitly. Chebyshev also proved that if 
$\lim_{x\to\infty}\frac{\psi(x)}{x}$ existed, its value was 1. Note that this implies that if 
$\lim_{x\to\infty}\frac{\pi(x)}{x/\ln x}$ exists, then this limit too must be 1. 

At the end of his paper, Sylvester noted that for a proof of the prime number 
theorem, “we shall probably have to wait until some one is born into the world as 
far surpassing Tchebycheff in insight and penetration as Tchebycheff has proved 
himself superior in these qualities to the ordinary run of mankind.” Chebyshev’s 
elementary but powerful methods formed the basis of a new topic, elementary methods 
in analytic number theory, and also served as motivation for Alphonse de Polignac 
(1826-1863) and Franz Mertens (1840-1927) to firmly establish this new subject.¹⁰ 
In 1874, Mertens showed that Chebyshev’s results could be used to obtain asymptotic 
formulas for the series 

$$\sum_{p\le x}\frac{\ln p}{p} \quad\text{and}\quad \sum_{p\le x}\frac{1}{p}.$$


8 Sylvester (1973) vol. 3, pp. 530-545. 
9 Schur (1929). 
10 Polignac (1857) and Mertens (1874a) and (1874b). 
Thus, he proved the following refinement of a result of de Polignac: 

$$\sum_{p\le x}\frac{\ln p}{p} = \ln x + O(1), \tag{30.10}$$

where $O(1)$ denoted a quantity bounded as $x \to \infty$. Mertens also gave a more precise 
formulation of Euler’s 1737 result (30.2): 

$$\sum_{p\le x}\frac{1}{p} = \ln\ln x + C + O\left(\frac{1}{\ln x}\right), \tag{30.11}$$


where 

$$C = \gamma + \sum_p\left(\ln\left(1 - \frac1p\right) + \frac1p\right) \tag{30.12}$$

and $\gamma$ denoted Euler’s constant. 

In his famous paper of 1859,¹¹ Riemann introduced ideas through which the prime 
number theorem would eventually be proved. Riemann’s interest in prime number 
theory was not surprising, surrounded as he was by great researchers in this field. 
Riemann began his paper by mentioning Gauss, Euler, and his good friend and teacher 
Dirichlet, writing that their attention to the subject would surely justify its further 
study. He did not mention Chebyshev, but he was familiar with the work of Chebyshev, 
to whom he sent a copy of his paper. Also, we know that as a student Riemann studied 
Legendre’s number theory book very carefully. Moreover, Dirichlet stated in a note 
of 1838¹² that his analytic methods for studying primes could provide a proof of 
Legendre’s conjecture related to the prime number theorem; Dirichlet, however, did 
not publish any ideas in this direction.¹³ 

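Riemann based his investigation of $\pi(x)$ on Euler’s product formula 

$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s} = \prod_p\left(1 - p^{-s}\right)^{-1}.$$

His innovation here was to take $s$ to be a complex variable with $\operatorname{Re} s > 1$. He then 
defined $\zeta(s)$ as a contour integral, thereby extending its domain to the whole complex 
plane, except for the pole at $s = 1$. He used the Euler product to show that 

$$\frac{\log\zeta(s)}{s} = \int_1^{\infty} f(x)\,x^{-s-1}\,dx, \quad \operatorname{Re} s > 1, \tag{30.13}$$

where 

$$f(x) = F(x) + \frac12 F\left(x^{1/2}\right) + \frac13 F\left(x^{1/3}\right) + \cdots. \tag{30.14}$$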


11 Riemann (1859). 
12 Dirichlet (1969) vol. 1, pp. 353-356. 


13 Dirichlet’s 1838 note was on asymptotic formulas in number theory; for his work in this area, see Dirichlet 
(1969) vol. 2, pp. 51-66 and pp. 99-104. 
We here write log because complex variables are involved. Riemann defined $F(x)$ 
as the number of primes less than $x$ when $x$ was not prime; but when $x$ was a prime, 

$$F(x) = \frac{F(x+0) + F(x-0)}{2}.$$

Thus, Riemann’s $F(x)$ was essentially $\pi(x)$. He obtained the integral representation 
for $f(x)$ by a method we now call Mellin inversion. Actually, he applied the 
Fourier inversion to get 

$$f(x) = \frac{1}{2\pi i}\int_{a-\infty i}^{a+\infty i}\frac{\log\zeta(s)}{s}\,x^s\,ds, \quad a > 1. \tag{30.15}$$

To evaluate this integral, Riemann defined the entire function 

$$\xi(s) = (s-1)\,\Pi\left(\frac{s}{2}\right)\pi^{-s/2}\,\zeta(s), \tag{30.16}$$

where $\Pi(s) = s\Gamma(s)$. He then obtained an infinite product (or Hadamard product) for 
$\xi(s)$, given by 

$$\xi(s) = \xi(0)\prod_{\rho}\left(1 - \frac{s}{\rho}\right). \tag{30.17}$$

To use this formula effectively in (30.15), one must first understand the distribution 
of the zeros $\rho$. It is easy to show that $0 \le \operatorname{Re}\rho \le 1$. It follows from the functional 
equation for $\zeta(s)$ that if $\rho$ is a zero, then so is $1 - \rho$. Riemann then observed that 
the number of roots $\rho$ whose imaginary parts lay between 0 and some value $T$ was 
approximately 

$$\frac{T}{2\pi}\ln\frac{T}{2\pi} - \frac{T}{2\pi}, \tag{30.18}$$

where the relative error was of the order $\frac{1}{T}$. He sketched a one-sentence proof of this 
result and added that the estimate for the number of zeros with $\operatorname{Re}\rho = \frac12$ was about the 
same as in (30.18). He remarked that it was very likely, though his passing attempts 
to prove it had failed, that all the roots had $\operatorname{Re}\rho = \frac12$. This is the famous Riemann 
hypothesis. 

By combining (30.15) and (30.17), and assuming the truth of his hypothesis, 
Riemann derived the formula 

$$f(x) = \operatorname{li}(x) - \sum_{\alpha}\left(\operatorname{li}\left(x^{\frac12+\alpha i}\right) + \operatorname{li}\left(x^{\frac12-\alpha i}\right)\right) + \int_x^{\infty}\frac{dt}{(t^2-1)\,t\ln t} + \ln\xi(0), \tag{30.19}$$

where the sum $\sum_{\alpha}$ was taken over all positive $\alpha$ such that $\frac12 + i\alpha$ was a zero of $\xi(s)$. 
Note that $\xi(0) = \frac12$, though due to some confusion in Riemann’s notation, he obtained 
a different value. 

In the final remarks in his paper, Riemann noted first that $F(x)$ (or $\pi(x)$) could be 
obtained from $f(x)$ by the inversion 

$$F(x) = \sum_{m=1}^{\infty}\frac{\mu(m)}{m}\,f\left(x^{1/m}\right),$$

where $\mu(m)$ was the Möbius function. We remark that Riemann did not use the 
brief notation $\mu(m)$. He also noted that the approximation $F(x) = \operatorname{li}(x)$ was correct 
only to an order of magnitude $x^{1/2}$, yielding a value somewhat too large, while a better 
approximation was given by 

$$\operatorname{li}(x) - \frac12\operatorname{li}\left(x^{1/2}\right) - \frac13\operatorname{li}\left(x^{1/3}\right) - \frac15\operatorname{li}\left(x^{1/5}\right) + \frac16\operatorname{li}\left(x^{1/6}\right) - \cdots. \tag{30.20}$$
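
Riemann's remark can be tried out on a single value. The sketch below is only an illustration under stated assumptions: it evaluates $\operatorname{li}(x)$ for $x > 1$ by the classical series $\gamma + \ln\ln x + \sum_{k\ge 1}(\ln x)^k/(k\cdot k!)$, keeps just the five terms displayed in (30.20), and takes the value $\pi(10^6) = 78498$ from the table earlier in the chapter. The function names `li` and `riemann_truncated` are ours.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def li(x, terms=200):
    """Logarithmic integral li(x) for x > 1, via gamma + ln ln x + sum u^k/(k*k!)."""
    u = math.log(x)
    total = EULER_GAMMA + math.log(u)
    term = 1.0
    for k in range(1, terms + 1):
        term *= u / k      # term now equals u^k / k!
        total += term / k  # add u^k / (k * k!)
    return total


def riemann_truncated(x):
    """The five displayed terms of (30.20); later terms are ignored here."""
    return (li(x)
            - li(x ** (1 / 2)) / 2
            - li(x ** (1 / 3)) / 3
            - li(x ** (1 / 5)) / 5
            + li(x ** (1 / 6)) / 6)


x = 10**6
pi_x = 78498  # pi(10^6), taken from the table of pi(x) above
print(f"li(x)            = {li(x):.3f}   error {li(x) - pi_x:+.3f}")
print(f"truncated (30.20) = {riemann_truncated(x):.3f}   error {riemann_truncated(x) - pi_x:+.3f}")
```

At this value of $x$ the truncated expression lies closer to $\pi(x)$ than $\operatorname{li}(x)$ alone does, in line with Riemann's observation; one value, of course, proves nothing about the general behavior.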


Apart from the Riemann hypothesis, the most difficult part of Riemann’s paper 
was his factorization of $\xi(s)$. Indeed, Weierstrass had to develop his theory of product 
representations of entire functions before even the simpler aspects of $\xi(s)$ could be 
tackled. Then in 1893, Jacques Hadamard (1865-1963) worked out the theory of 
factorization of entire functions of a finite order and applied it to $\xi(s)$.¹⁴ That set 
the stage for his 1896 proof of the prime number theorem.¹⁵ Briefly, Hadamard first 
proved that $\zeta(s)$ had no zeros on the line $\operatorname{Re} s = 1$. Then, using earlier ideas of Cahen 
and Halphen, he applied Mellin inversion to an integral of a weighted average, say 
$A(x)$, of Chebyshev’s arithmetical function $\theta(x)$. From this inversion, Hadamard 
derived the asymptotic behavior of $A(x)$ and this in turn yielded the asymptotic 
behavior of $\theta(x)$, that 

$$\lim_{x\to\infty}\frac{\theta(x)}{x} = 1.$$


This proved the prime number theorem. It is interesting that in an 1885 letter to 
Hermite,¹⁶ Stieltjes claimed to have a proof of the Riemann hypothesis. Aware of this 
claim, Hadamard remarked that since Stieltjes had not published his proof, he himself 
would put forward a proof of the simpler result. 

Also in 1896, C. J. de la Vallée-Poussin published his own proof of the prime 
number theorem (PNT), based on similar ideas.¹⁷ After these proofs appeared, 
research on the prime number theorem centered around efforts to simplify the proof 
and to understand its logical structure. E. Landau, G. H. Hardy, J. E. Littlewood, and 
N. Wiener were the main contributors to this endeavor. In 1903, Landau found a new 


14 Hadamard (1893). 

15 Hadamard (1896). 

16 See letter 77 of Baillaud and Bourget (1905). 
17 Vallée-Poussin (1896b). 
proof of the prime number theorem,¹⁸ not dependent on Hadamard’s theory of entire 
functions or on the functional relation for the zeta function. Landau required only that 
$\zeta(s)$ could be continued slightly to the left of $\operatorname{Re} s = 1$. This method could be extended 
to the Dedekind zeta function for number fields and Landau used it to state and prove 
the prime ideal theorem. 

Hardy, Littlewood, and Wiener explicated the key role of Tauberian theorems in 
prime number theory. It became clear from their work that the prime number theorem 
was equivalent to the statement that $\zeta(1 + it) \ne 0$ for real $t$. On the basis of this 
result, Hardy expected that the zeta function would play a crucial role in any proof 
of the PNT. But two years after Hardy’s death, Atle Selberg and Paul Erdős found an 
elementary proof of the PNT, obviously without zeta function theory.¹⁹ 


30.2 Chebyshev on Legendre’s Formula 


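Recall that Euler proved the divergence of $\sum_p \frac1p$, where $p$ was prime, by comparing it 
with $\ln\left(\sum_{n=1}^{\infty}\frac1n\right)$. In his 1848 paper,²⁰ Chebyshev followed up on this work, proving 
in his first theorem the existence of the limit 

$$\lim_{\rho\to 0}\left(\sum_p\frac{\ln p}{p^{1+\rho}} - \sum_{k=2}^{\infty}\frac{1}{k^{1+\rho}}\right) \tag{30.21}$$

and, more generally, of the limit after taking derivatives with respect to $\rho$, 

$$\lim_{\rho\to 0}\left(\sum_p\frac{\ln^n p}{p^{1+\rho}} - \sum_{k=2}^{\infty}\frac{\ln^{n-1} k}{k^{1+\rho}}\right), \quad n = 1, 2, 3, \ldots. \tag{30.22}$$

In order to obtain information about $\pi(x)$, the number of primes less than $x$, he wrote 
the series in (30.22) as 

$$\sum_{x=2}^{\infty}\left(\pi(x+1) - \pi(x) - \frac{1}{\ln x}\right)\frac{\ln^n x}{x^{1+\rho}}. \tag{30.23}$$

Then since 

$$\frac{1}{\ln x} - \int_x^{x+1}\frac{dt}{\ln t} = O\left(\frac1x\right) \quad\text{as } x \to \infty,$$

Chebyshev deduced the important corollary of the existence of the limit 

$$\lim_{\rho\to 0}\sum_{x=2}^{\infty}\left(\pi(x+1) - \pi(x) - \int_x^{x+1}\frac{dt}{\ln t}\right)\frac{\ln^n x}{x^{1+\rho}}. \tag{30.24}$$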
x= 


18 Landau (1903). 
19 Selberg (1949) and Erdős (1949). 
20 Chebyshev (1848). 
From this, Chebyshev proceeded to derive his second theorem: For any positive 
real number $\alpha$, any positive integer $n$, and for infinitely many integer values of $x$, 

$$\pi(x) > \int_2^x\frac{dt}{\ln t} - \frac{\alpha x}{\ln^n x}, \tag{30.25}$$

and with the same conditions on $\alpha$ and $n$, for infinitely many integer values of $x$, 

$$\pi(x) < \int_2^x\frac{dt}{\ln t} + \frac{\alpha x}{\ln^n x}. \tag{30.26}$$


Using (30.25) and (30.26), Chebyshev could state his remarkable result that if 

$$\lim_{x\to\infty}\frac{\pi(x)}{\int_2^x\frac{dt}{\ln t}} \quad\text{or}\quad \lim_{x\to\infty}\frac{\pi(x)}{x/\ln x}$$

existed, then its value had to be 1. Of course, he was unable to show existence here, 
and that was the essence of the PNT. From (30.25) and (30.26), he also deduced 
that if 

$$\lim_{x\to\infty}\left(\frac{x}{\pi(x)} - \ln x\right)$$

existed, then it had to be $-1$. He supposed the limit to be $L$, so that for any $\epsilon > 0$ there 
would exist an $N$ such that for $x > N$, 

$$L - \epsilon < \frac{x}{\pi(x)} - \ln x < L + \epsilon.$$

But by (30.25), there would be an infinite number of integers $x > N$ such that 

$$\frac{x}{\int_2^x\frac{dt}{\ln t} - \frac{\alpha x}{\ln^n x}} - \ln x > L - \epsilon,$$

or 

$$L + 1 < \frac{x - (\ln x - 1)\left(\int_2^x\frac{dt}{\ln t} - \frac{\alpha x}{\ln^n x}\right)}{\int_2^x\frac{dt}{\ln t} - \frac{\alpha x}{\ln^n x}} + \epsilon. \tag{30.27}$$

Similarly, (30.26) implied an inequality in the other direction. At this point, 
Chebyshev remarked that by a principle of differential calculus (now called l’Hôpital’s 
rule), the expression on the right-hand side of (30.27) could be made arbitrarily small 
as $x$ became large, so that the result followed. He also remarked that this theorem 
determined that the limit of $\frac{x}{\pi(x)} - \ln x$, as $x$ went to infinity, was $-1$, contradicting 
Legendre, who predicted the limit would be $-1.08366$. 

Chebyshev’s proof of (30.26) was similar to his argument for (30.25). He first 
supposed (30.26) to hold for only a finite number of positive integers $x$ so that there 
would be an integer $a$ larger than $e^n$ and larger than the largest integer $x$ for which 
(30.26) would hold. Then for $x > a$, 

$$\pi(x) - \int_2^x\frac{dt}{\ln t} \ge \frac{\alpha x}{\ln^n x}, \qquad \frac{n}{\ln x} < 1. \tag{30.28}$$

Chebyshev showed that in this case the series in (30.24) would diverge, a contradiction 
proving the result. To demonstrate this divergence, he used Abel’s summation 
by parts: 

$$\sum_{x=a+1}^{s} u_x\,(v_{x+1} - v_x) = u_s v_{s+1} - u_a v_{a+1} - \sum_{x=a+1}^{s} v_x\,(u_x - u_{x-1}).$$

He took 

$$v_x = \pi(x) - \int_2^x\frac{dt}{\ln t}, \qquad u_x = \frac{\ln^n x}{x^{1+\rho}},$$

so that 

$$\begin{aligned}
\sum_{x=a+1}^{s}&\left(\pi(x+1) - \pi(x) - \int_x^{x+1}\frac{dt}{\ln t}\right)\frac{\ln^n x}{x^{1+\rho}} \\
&= \left(\pi(s+1) - \int_2^{s+1}\frac{dt}{\ln t}\right)\frac{\ln^n s}{s^{1+\rho}} - \left(\pi(a+1) - \int_2^{a+1}\frac{dt}{\ln t}\right)\frac{\ln^n a}{a^{1+\rho}} \\
&\quad - \sum_{x=a+1}^{s}\left(\pi(x) - \int_2^x\frac{dt}{\ln t}\right)\left(\frac{\ln^n x}{x^{1+\rho}} - \frac{\ln^n(x-1)}{(x-1)^{1+\rho}}\right).
\end{aligned} \tag{30.29}$$

By the mean value theorem, 

$$\frac{\ln^n x}{x^{1+\rho}} - \frac{\ln^n(x-1)}{(x-1)^{1+\rho}} = \left(\frac{n}{\ln(x-\theta)} - (1+\rho)\right)\frac{\ln^n(x-\theta)}{(x-\theta)^{2+\rho}},$$

where $0 < \theta < 1$ and $\theta$ depended on $x$. The sum (30.29) then took the form 

$$\sum_{x=a+1}^{s}\left(\pi(x) - \int_2^x\frac{dt}{\ln t}\right)\left(1 + \rho - \frac{n}{\ln(x-\theta)}\right)\frac{\ln^n(x-\theta)}{(x-\theta)^{2+\rho}}. \tag{30.30}$$

Then for $x > a$, 

$$1 + \rho - \frac{n}{\ln(x-\theta)} > 1 - \frac{n}{\ln a},$$

and by (30.28) 

$$\pi(x) - \int_2^x\frac{dt}{\ln t} \ge \frac{\alpha x}{\ln^n x} > \frac{\alpha(x-\theta)}{\ln^n(x-\theta)}.$$

Observe that Chebyshev could derive the last inequality because $\frac{x}{\ln^n x}$ was an 
increasing function. Therefore, he could see that the sum (30.30) was greater than 

$$\alpha\left(1 - \frac{n}{\ln a}\right)\sum_{x=a+1}^{s}\frac{1}{(x-\theta)^{1+\rho}}.$$

Chebyshev thus arrived at a contradiction: when $s \to \infty$, he had the infinite series 
(30.24) diverging as $\rho \to 0$. 

Chebyshev’s proof of the existence of the limit (30.21), his first theorem that we 
are discussing, made use of the formula found in Euler and Abel: 

$$\int_0^{\infty}\frac{e^{-x}}{e^x - 1}\,x^{\rho}\,dx = \sum_{m=2}^{\infty}\int_0^{\infty}e^{-mx}x^{\rho}\,dx = \sum_{m=2}^{\infty}\frac{1}{m^{1+\rho}}\int_0^{\infty}e^{-x}x^{\rho}\,dx. \tag{30.31}$$

To show that the limit (30.21) existed, Chebyshev rewrote the sums contained in it 
as 

$$-\frac{d}{d\rho}\left(\sum_p\ln\left(1 - \frac{1}{p^{1+\rho}}\right) + \sum_p\frac{1}{p^{1+\rho}}\right) - \frac{d}{d\rho}\left(\ln\rho - \sum_p\ln\left(1 - \frac{1}{p^{1+\rho}}\right)\right) - \left(\sum_{m=2}^{\infty}\frac{1}{m^{1+\rho}} - \frac{1}{\rho}\right) \tag{30.32}$$

and proved that each of the three expressions in parentheses was finite as $\rho \to 0$. 
He proved the more general result for (30.22), by showing that the derivatives of 
those expressions also had finite limits. Using (30.31), Chebyshev rewrote the third 
expression in (30.32) as a ratio of two integrals: 

$$\sum_{m=2}^{\infty}\frac{1}{m^{1+\rho}} - \frac{1}{\rho} = \frac{\int_0^{\infty}\left(\frac{1}{e^x - 1} - \frac1x\right)e^{-x}x^{\rho}\,dx}{\int_0^{\infty}e^{-x}x^{\rho}\,dx}. \tag{30.33}$$

He noted that these integrals converged as $\rho \to 0$ and that the derivatives of (30.33) 
contained expressions of the form 

$$\int_0^{\infty}\left(\frac{1}{e^x - 1} - \frac1x\right)e^{-x}x^{\rho}(\ln x)^k\,dx \quad\text{or}\quad \int_0^{\infty}e^{-x}x^{\rho}(\ln x)^k\,dx;$$

these integrals also had finite limits as $\rho \to 0$. To show that the middle expression in 
(30.32), 

$$\ln\rho - \sum_p\ln\left(1 - \frac{1}{p^{1+\rho}}\right),$$

was finite as $\rho \to 0$, Chebyshev employed the Euler product 

$$\sum_{m=1}^{\infty}\frac{1}{m^{1+\rho}} = \prod_p\left(1 - \frac{1}{p^{1+\rho}}\right)^{-1}.$$

After taking the logarithm of both sides and adding $\ln\rho$ to each side, he had 

$$\ln\rho - \sum_p\ln\left(1 - \frac{1}{p^{1+\rho}}\right) = \ln\left(\left(1 + \sum_{m=2}^{\infty}\frac{1}{m^{1+\rho}}\right)\rho\right) = \ln\left(1 + \rho + \left(\sum_{m=2}^{\infty}\frac{1}{m^{1+\rho}} - \frac1\rho\right)\rho\right). \tag{30.34}$$

Noting the expression on the right-hand side, Chebyshev thus proved the existence 
of the limit of the left-hand side as well as the limits of all its derivatives, as $\rho \to 0$. It 
is even simpler to show that the first expression in (30.32), and all its derivatives, have 
a finite limit as $\rho \to 0$. This proves Chebyshev’s first theorem. 

At the end of his 1849 paper, Chebyshev followed Legendre in assuming the prime 
number theorem to prove that 

$$\frac12 + \frac13 + \frac15 + \cdots + \frac1x = \ln\ln x + c, \tag{30.35}$$

where $x$ was a very large prime and $c$ was finite. Chebyshev corrected the corresponding 
formula in Legendre, who had $\ln(\ln x - 0.08366)$ on the right-hand side. 
Chebyshev also suggested a similar change in Legendre’s formula for the product 

$$\left(1 - \frac12\right)\left(1 - \frac13\right)\left(1 - \frac15\right)\cdots\left(1 - \frac1x\right) = \begin{cases}\dfrac{c_0}{\ln x} & \text{in Chebyshev,}\\[2ex] \dfrac{c_0}{\ln x - 0.08366} & \text{in Legendre.}\end{cases} \tag{30.36}$$


In 1874, Mertens proved, without assuming the then-unproved PNT, that $c_0 = e^{-\gamma}$, 
where $\gamma$ was Euler’s constant. Thus, Mertens’s result implies 

$$\prod_{p\le x}\left(1 - \frac1p\right)^{-1} = e^{\gamma}\ln x + O(1). \tag{30.37}$$

An approximate value of $e^{\gamma}$ is 1.781, whereas, presumably on numerical evidence, 
Gauss gave the value of the constant to be 1.874.²¹ 
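
The constant in (30.37) is easy to examine numerically. The following sketch is ours, with hypothetical helper names `primes_up_to` and `mertens_ratio`; it simply forms the product over the primes up to $x$ with a small sieve and divides by $\ln x$, so that the result can be compared with $e^{\gamma} \approx 1.781$ and with Gauss's value 1.874.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def primes_up_to(limit):
    """Return the list of primes <= limit (sieve of Eratosthenes)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n in range(limit + 1) if sieve[n]]


def mertens_ratio(x):
    """Return prod_{p <= x} (1 - 1/p)^(-1) divided by ln x."""
    product = 1.0
    for p in primes_up_to(x):
        product /= 1.0 - 1.0 / p
    return product / math.log(x)


for x in (10**2, 10**4, 10**6):
    print(f"x = {x:>8}: ratio = {mertens_ratio(x):.5f}   e^gamma = {math.exp(EULER_GAMMA):.5f}")
```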

30.3 Chebyshev’s Proof of Bertrand’s Conjecture 

In his second memoir on prime numbers,²² Chebyshev proved Bertrand’s conjecture, 
making effective and original use of Stirling’s approximation. In the course of his 
discussion of the series for the logarithm of $n!$, he was led to define two related 
arithmetical functions $\theta(x)$ and $\psi(x)$: 

$$\theta(x) = \sum_{p\le x}\ln p, \qquad \psi(x) = \sum_{p^n\le x}\ln p, \tag{30.38}$$

where $p$ was prime. Chebyshev immediately noted the clear relation between the two: 

$$\psi(x) = \theta(x) + \theta\left(\sqrt{x}\right) + \theta\left(\sqrt[3]{x}\right) + \theta\left(\sqrt[4]{x}\right) + \cdots. \tag{30.39}$$

21 Gauss (1863-1927) vol. 10, p. 12. 
22 Chebyshev (1850). 

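Keeping in mind the preceding definitions and a result first noted by Legendre, 

$$n! = \prod_{p\le n} p^{\left\lfloor\frac{n}{p}\right\rfloor + \left\lfloor\frac{n}{p^2}\right\rfloor + \left\lfloor\frac{n}{p^3}\right\rfloor + \cdots}, \tag{30.40}$$

Chebyshev observed that if $T(x) = \ln\lfloor x\rfloor!$, then 

$$T(x) = \psi(x) + \psi\left(\frac{x}{2}\right) + \psi\left(\frac{x}{3}\right) + \cdots. \tag{30.41}$$

Next, he set $a = \lfloor x\rfloor$ so that, by Stirling’s formula, 

$$T(x) = \ln a! < \frac12\ln 2\pi + a\ln a - a + \frac12\ln a + \frac{1}{12},$$

$$T(x) = \ln(a+1)! - \ln(a+1) > \frac12\ln 2\pi + (a+1)\ln(a+1) - (a+1) - \frac12\ln(a+1).$$

Thus, 

$$\frac12\ln 2\pi + x\ln x - x - \frac12\ln x < T(x) < \frac12\ln 2\pi + x\ln x - x + \frac12\ln x + \frac{1}{12}. \tag{30.42}$$

From these bounds for $T(x)$, Chebyshev obtained bounds for $\psi(x)$. Of course, 
one might obtain an expression for $\psi(x)$ in terms of $T(x)$ from (30.41) by means of 
Möbius inversion. However, Chebyshev chose to work with the sum 

$$T(x) - T\left(\frac{x}{2}\right) - T\left(\frac{x}{3}\right) - T\left(\frac{x}{5}\right) + T\left(\frac{x}{30}\right). \tag{30.43}$$

He showed that when the value of $T(x)$, taken from (30.41), was substituted in 
(30.43), the result was the alternating series 

$$\psi(x) - \psi\left(\frac{x}{6}\right) + \psi\left(\frac{x}{7}\right) - \psi\left(\frac{x}{10}\right) + \psi\left(\frac{x}{11}\right) - \psi\left(\frac{x}{12}\right) + \psi\left(\frac{x}{13}\right) - \psi\left(\frac{x}{15}\right) + \cdots. \tag{30.44}$$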
He observed that, in general, the coefficient of $\psi\left(\frac{x}{n}\right)$ would be 

$+1$ if $n = 30m + k$, $k = 1, 7, 11, 13, 17, 19, 23, 29$; (30.45) 
$0$ if $n = 30m + k$, $k = 2, 3, 4, 5, 8, 9, 14, 16, 21, 22, 25, 26, 27, 28$; (30.46) 
$-1$ if $n = 30m + k$, $k = 6, 10, 12, 15, 18, 20, 24$; (30.47) 
$-1$ if $n = 30m + 30$. (30.48) 
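
The residue classes in (30.45) to (30.48) can be confirmed mechanically. In the sketch below, which is our illustration rather than Chebyshev's own bookkeeping, we use the fact that by (30.41) the term $T(x/d)$ contributes $\psi(x/n)$ exactly when $d$ divides $n$; the function name `coefficient` is ours.

```python
def coefficient(n):
    """Coefficient of psi(x/n) in T(x) - T(x/2) - T(x/3) - T(x/5) + T(x/30).

    By (30.41), T(x/d) = sum over k of psi(x/(d*k)), so T(x/d) contributes
    one copy of psi(x/n) exactly when d divides n.
    """
    return 1 - (n % 2 == 0) - (n % 3 == 0) - (n % 5 == 0) + (n % 30 == 0)


residues = range(1, 31)
plus = [k for k in residues if coefficient(k) == 1]
zero = [k for k in residues if coefficient(k) == 0]
minus = [k for k in residues if coefficient(k) == -1]

# The nonzero coefficients, read in order of n, should alternate in sign.
signs = [coefficient(n) for n in range(1, 301) if coefficient(n) != 0]
alternates = all(a == -b for a, b in zip(signs, signs[1:]))

print("+1:", plus)
print(" 0:", zero)
print("-1:", minus)
print("alternating:", alternates)
```

Since the coefficient depends only on $n$ modulo 30, checking the residues 1 through 30 covers every $n$.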


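Note here that the series (30.44) was alternating and that the absolute values of terms 
were nonincreasing, making the sum of the series less than the first term and greater 
than the sum of the first two terms. Thus, Chebyshev could conclude that 

$$\psi(x) - \psi\left(\frac{x}{6}\right) < T(x) - T\left(\frac{x}{2}\right) - T\left(\frac{x}{3}\right) - T\left(\frac{x}{5}\right) + T\left(\frac{x}{30}\right) < \psi(x).$$

An application of the two inequalities (30.42) then yielded 

$$Ax - \frac52\ln x - 1 < T(x) - T\left(\frac{x}{2}\right) - T\left(\frac{x}{3}\right) - T\left(\frac{x}{5}\right) + T\left(\frac{x}{30}\right) < Ax + \frac52\ln x,$$

where 

$$A = \ln\frac{2^{1/2}\,3^{1/3}\,5^{1/5}}{30^{1/30}} = 0.92129202\ldots. \tag{30.49}$$

In this way, Chebyshev obtained the two inequalities 

$$\psi(x) > Ax - \frac52\ln x - 1 \quad\text{and}\quad \psi(x) - \psi\left(\frac{x}{6}\right) < Ax + \frac52\ln x. \tag{30.50}$$

The first inequality determined a lower bound for $\psi(x)$; Chebyshev obtained an upper 
bound from the second inequality by employing an interesting trick. He set 

$$f(x) = \frac65 Ax + \frac{5}{4\ln 6}\ln^2 x + \frac54\ln x$$

and by a simple calculation obtained 

$$f(x) - f\left(\frac{x}{6}\right) = Ax + \frac52\ln x.$$

Therefore, by the second inequality 

$$\psi(x) - \psi\left(\frac{x}{6}\right) < f(x) - f\left(\frac{x}{6}\right) \quad\text{or}\quad \psi(x) - f(x) < \psi\left(\frac{x}{6}\right) - f\left(\frac{x}{6}\right).$$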
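Replacing $x$ by $\frac{x}{6}, \frac{x}{6^2}, \ldots, \frac{x}{6^m}$ successively, he got 

$$\psi(x) - f(x) < \psi\left(\frac{x}{6}\right) - f\left(\frac{x}{6}\right) < \psi\left(\frac{x}{6^2}\right) - f\left(\frac{x}{6^2}\right) < \cdots < \psi\left(\frac{x}{6^{m+1}}\right) - f\left(\frac{x}{6^{m+1}}\right).$$

Taking $m$ to be the largest integer for which $\frac{x}{6^m} \ge 1$, $\frac{x}{6^{m+1}}$ would have to lie between 
$\frac16$ and 1. Therefore, 

$$\psi\left(\frac{x}{6^{m+1}}\right) - f\left(\frac{x}{6^{m+1}}\right) < 1$$

and 

$$\psi(x) - f(x) < 1 \quad\text{or}\quad \psi(x) < f(x) + 1.$$

Thus 

$$\psi(x) < \frac65 Ax + \frac{5}{4\ln 6}\ln^2 x + \frac54\ln x + 1. \tag{30.51}$$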
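
As a sanity check on the two bounds for $\psi(x)$, the lower one from (30.50) and the upper one in (30.51), one can tabulate $\psi$ directly. The sketch below is an illustration only: the helper names `psi_table`, `lower`, and `upper` are ours, and the range $30 \le x \le 3000$ is an arbitrary choice made for the test, not a range singled out by Chebyshev.

```python
import math

# Chebyshev's constant A from (30.49)
A = math.log(2 ** (1 / 2) * 3 ** (1 / 3) * 5 ** (1 / 5) / 30 ** (1 / 30))


def psi_table(limit):
    """Return a list with psi[n] = sum of ln p over prime powers p^k <= n."""
    smallest = list(range(limit + 1))  # smallest prime factor of each n
    for i in range(2, math.isqrt(limit) + 1):
        if smallest[i] == i:
            for j in range(i * i, limit + 1, i):
                if smallest[j] == j:
                    smallest[j] = i
    psi, total = [0.0] * (limit + 1), 0.0
    for n in range(2, limit + 1):
        p, m = smallest[n], n
        while m % p == 0:
            m //= p
        if m == 1:  # n is a power of the single prime p
            total += math.log(p)
        psi[n] = total
    return psi


def lower(x):
    """Lower bound for psi(x) from (30.50)."""
    return A * x - 2.5 * math.log(x) - 1


def upper(x):
    """Upper bound for psi(x) from (30.51)."""
    return 1.2 * A * x + 5 / (4 * math.log(6)) * math.log(x) ** 2 + 1.25 * math.log(x) + 1


psi = psi_table(3000)
bounds_hold = all(lower(x) < psi[x] < upper(x) for x in range(30, 3001))
print(f"A = {A:.8f}; bounds hold for 30 <= x <= 3000: {bounds_hold}")
```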


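Chebyshev obtained bounds for $\theta(x)$ from those of $\psi(x)$. He observed that (30.39) 
implied that 

$$\psi(x) - \psi\left(\sqrt{x}\right) = \theta(x) + \theta\left(\sqrt[3]{x}\right) + \theta\left(\sqrt[5]{x}\right) + \cdots,$$

$$\psi(x) - 2\psi\left(\sqrt{x}\right) \le \theta(x) \le \psi(x) - \psi\left(\sqrt{x}\right). \tag{30.52}$$

He concluded from the bounds for $\psi(x)$ in (30.50) and (30.51), together with (30.52), that 

$$Ax - \frac{12}{5}Ax^{1/2} - \frac{5}{8\ln 6}\ln^2 x - \frac{15}{4}\ln x - 3 < \theta(x) < \frac65 Ax - Ax^{1/2} + \frac{5}{4\ln 6}\ln^2 x + \frac52\ln x + 2. \tag{30.53}$$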


With the help of these inequalities, Chebyshev was able to prove Bertrand’s 
conjecture. 

Chebyshev argued that if there were exactly $m$ primes between the numbers $l$ and $L$, 
then $\theta(L) - \theta(l)$ could be expressed as the sum of the logarithms of these primes and 
hence 

$$m\ln l < \theta(L) - \theta(l) < m\ln L \tag{30.54}$$

or 

$$\frac{\theta(L) - \theta(l)}{\ln L} < m < \frac{\theta(L) - \theta(l)}{\ln l}. \tag{30.55}$$

He denoted the upper and lower bounds of $\theta(x)$ in (30.53) by $\theta_I(x)$ and $\theta_{II}(x)$, 
respectively, and noted that by the last inequality, $m$ was greater than 
$k = \frac{\theta_{II}(L) - \theta_I(l)}{\ln L}$. Substituting the values of $\theta_{II}(L)$ and $\theta_I(l)$ and solving for $l$, Chebyshev 
obtained 

$$l = \frac56 L - 2L^{1/2} - \frac{25\ln^2 L}{16A\ln 6} - \frac{5}{6A}\left(\frac{25}{4} + k\right)\ln L - \frac{25}{6A} \tag{30.56}$$

and observed that between $l$ and $L$ there were more than $k$ primes. He then took $k = 0$ 
and saw that there had to be at least one prime between 

$$\frac56 L - 2L^{1/2} - \frac{25\ln^2 L}{16A\ln 6} - \frac{125}{24A}\ln L - \frac{25}{6A} \quad\text{and}\quad L. \tag{30.57}$$

Finally, he remarked that for $L = 2a - 3$ and $a > 160$, the value of $l$ in (30.57) was 
larger than $a$, and hence there was a prime between $a$ and $2a - 3$. Since the conjecture 
could be confirmed to hold for values of $a \le 160$, this completed the proof. In 1919, 
Ramanujan published a similar but very brief proof of Bertrand’s conjecture. It may 
also be of interest to note that in 1932, when Paul Erdős was only eighteen years of 
age, he found a proof quite similar to Ramanujan’s. 
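
The finite verification mentioned above takes only a few lines today. The following sketch is ours and uses trial division for simplicity; it checks Bertrand's conjecture in the form stated earlier, namely that for every integer $n > 3$ some prime $p$ satisfies $n < p < 2n - 2$, over the range $4 \le n \le 160$. The function names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small range used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def has_prime_between(n):
    """True if some prime p satisfies n < p < 2n - 2."""
    return any(is_prime(p) for p in range(n + 1, 2 * n - 2))


small_cases_hold = all(has_prime_between(n) for n in range(4, 161))
print("Bertrand's conjecture holds for 4 <= n <= 160:", small_cases_hold)
```

The restriction $n > 3$ is needed: for $n = 3$ there is no integer strictly between 3 and 4.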

Chebyshev closed his paper by giving bounds for $\pi(x)$, the number of primes less 
than $x$. He derived these bounds as a corollary to an interesting theorem on series: 
Supposing that for large enough $x$, $\frac{F(x)}{\ln x}$ was positive and decreasing, the series 

$$F(2) + F(3) + F(5) + F(7) + F(11) + F(13) + \cdots$$

converged if and only if the series 

$$\frac{F(2)}{\ln 2} + \frac{F(3)}{\ln 3} + \frac{F(4)}{\ln 4} + \frac{F(5)}{\ln 5} + \frac{F(6)}{\ln 6} + \cdots$$

converged. Clearly, this theorem implied the convergence of the series $\sum_p\frac{1}{p\ln p}$. In 
fact, Chebyshev showed that the sum of the series lay between 1.53 and 1.73. To prove 
this theorem, Chebyshev took $\alpha, \beta, \gamma, \ldots, \rho$ to be the prime numbers between the integers 
$l$ and $L$. Then he defined $U$ by 

$$S = F(2) + F(3) + F(5) + \cdots + F(\alpha) + F(\beta) + F(\gamma) + \cdots + F(\rho) = S_0 + F(\alpha) + F(\beta) + F(\gamma) + \cdots + F(\rho) = S_0 + U.$$

Since $\theta(x) - \theta(x-1) = \ln x$ for prime $x$, and $= 0$ for composite $x$, Chebyshev 
could conclude that 

$$U = \frac{\theta(l) - \theta(l-1)}{\ln l}F(l) + \frac{\theta(l+1) - \theta(l)}{\ln(l+1)}F(l+1) + \frac{\theta(l+2) - \theta(l+1)}{\ln(l+2)}F(l+2) + \cdots + \frac{\theta(L) - \theta(L-1)}{\ln L}F(L).$$




He then applied summation by parts to obtain 
_ F(L) (FO — Fd +1) fFG+)D — FG@+2) 
ae (= In(/ >) on @ +1) Ind =) o« o 


e (fe Pee) oa) _ Fut) 
InL InL+h In(L +1) 


@(L). (30.58) 


He took / large enough that F(x) In x was positive and increasing in] —1 <x < 
L + 1 and obtained the inequalities 


E 


FO FO | Gi (x) — 811% — 1) 
be 1 ay 6;(1 — 1) ar 2 Fe) re <U 
Fil) F(L) | iz O7(x) — O7(x — 1) 
00-1) on 2 Fe) . (30.59) 
Chebyshev noted that 


O77 (x) — O77(x — 1) 
= SAE J/x—1) 


5 9 9 15 
ane In*(x — 1)) ae In(x — 1)) 


and that this expression was bounded as x — oo. For example, 


1 
1\2 1 
Ji ~Vam1= va-vi(I -) err: >0x > oO, 


and 


x-— 


1 1 
Inx —In(x — 1) =In =in(1-~) +0 as Xx > ©. 
x 


By a similar analysis with 67(x) — 07(x — 1), he noted that the two inequalities 
in (30.59) implied the theorem. Chebyshev obtained the bounds for (x) by taking 
F(x) = 1 and/ = 2 in (30.59): 


E 


Grrl) » Or.) y O11 (x) — 811% — 1) 


ine ane ee RY) 


x=2 


470) — 977) ys 87 (x) — 97(x — 1) 


In2 In2 In x 


x=? 


30.4 De Polignac’s Evaluation of $\sum_{p\le x}\frac{\ln p}{p}$ 

Inspired by the work of Chebyshev, in the 1850s, Alphonse de Polignac published a 
number of papers in the Comptes Rendus and Liouville’s Journal. Though his work 
was largely lacking in rigor, in 1857²³ de Polignac gave a fairly good proof of 

$$\sum_{p\le x}\frac{\ln p}{p} = \ln x + \epsilon,$$

where $\epsilon$ was a quantity small compared to $\ln x$. In fact, his proof implies that $\epsilon$ is 
bounded. Now by Chebyshev’s work, 


ree-E((s)-Lal+Lae~)oe 


p” <x 


De Polignac denoted the left-hand side by In F,(x) and ial by E (+). He let n be 
the largest integer such that p” < x. Then 


and 


Therefore, 


oe = P"(p ea 
In In 
mx a ee eins 
ae pa ee 


De Polignac then argued that }° pr<x Np = en<x Inn = x Inx + terms of smaller 


In p Inp Inp 
order, and )/ 5~7 was of the same order as )/ —". Moreover, }/,<x Gap Was 
bounded, so that the required result followed from the two inequalities. In 1874, Franz 
Mertens, aware of de Polignac’s work, but motivated by Chebyshev’s second paper, 


proved that 

$$\sum_{p\le x}\frac{\ln p}{p} = \ln x + O(1).$$
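
The boundedness asserted in Mertens's formula can be observed directly. The sketch below is our own illustration with hypothetical helper names; it tabulates $\sum_{p\le x}\frac{\ln p}{p} - \ln x$ for a few values of $x$ chosen arbitrarily, so that one can watch the difference stay bounded while $\ln x$ grows.

```python
import math


def primes_up_to(limit):
    """Return the list of primes <= limit (sieve of Eratosthenes)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [n for n in range(limit + 1) if sieve[n]]


def mertens_sum_gap(x):
    """Return sum_{p <= x} ln(p)/p - ln(x)."""
    return sum(math.log(p) / p for p in primes_up_to(x)) - math.log(x)


gaps = {x: mertens_sum_gap(x) for x in (10, 100, 1000, 10**5)}
for x, gap in gaps.items():
    print(f"x = {x:>7}: sum - ln x = {gap:+.4f}")
```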


23 Polignac (1857). 

30.5 Mertens’s Evaluation of $\prod_{p\le x}\left(1 - \frac1p\right)^{-1}$ 

According to Mertens, his interest in evaluating $\prod_{p\le x}\left(1 - \frac1p\right)$ arose from the 
useful formulas he had seen in the third edition of Legendre’s Théorie des 
nombres.²⁴ Legendre’s formulas stated without rigorous proof that for some constants 
$A$ and $C$, 

$$\sum_{p\le G}\frac1p = \ln(\ln G - 0.08366) + C,$$

and 

$$\prod_{p\le G}\left(1 - \frac1p\right) = \frac{A}{\ln G - 0.08366}.$$

Mertens proved the results: First, that 

$$\sum_{p\le G}\frac1p = \ln\ln G + \gamma - H + \delta, \tag{30.60}$$

where 

$$H = \sum_{k=2}^{\infty}\frac1k\sum_p\frac{1}{p^k}$$

and 

$$|\delta| < \frac{4}{\ln(G+1)} + \frac{2}{G\ln G}.$$

Secondly, 

$$\prod_{p\le G}\left(1 - \frac1p\right)^{-1} = e^{\gamma + \delta'}\ln G,$$

where 

$$|\delta'| < \frac{4}{\ln(G+1)} + \frac{2}{G\ln G} + \frac{1}{2G}.$$

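He began by observing that Dirichlet and Chebyshev had shown that for $\rho > 0$, 

$$\sum_{n=1}^{\infty}\frac{1}{n^{1+\rho}} = \frac1\rho + \gamma + (\rho),$$

where $(\rho)$ denoted a quantity tending to 0 as $\rho \to 0$. It followed from this and Euler’s 
product for $\zeta(1+\rho)$ that 

$$\ln\frac1\rho + (\rho) = \ln\zeta(1+\rho) = -\sum_p\ln\left(1 - \frac{1}{p^{1+\rho}}\right) = \sum_p\frac{1}{p^{1+\rho}} + \frac12\sum_p\frac{1}{p^{2+2\rho}} + \frac13\sum_p\frac{1}{p^{3+3\rho}} + \cdots.$$

Mertens could then easily conclude that 

$$\sum_p\frac{1}{p^{1+\rho}} = \ln\frac1\rho - H + (\rho). \tag{30.61}$$

He proceeded to complete the proof of (30.60) by showing that 

$$\sum_{p>G}\frac{1}{p^{1+\rho}} = \ln\frac1\rho - \ln\ln G - \gamma - \delta + (\rho), \tag{30.62}$$

where 

$$|\delta| < \frac{4}{\ln(G+1)} + \frac{2}{G\ln G}.$$

24 Mertens (1874a) and (1874b). 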


He set f(x) = >> eet me Then summation by parts gave him 


_ fG) _< 1 1 
~~ (G+D)oInG+) _ eee” (= es): 
(30.63) 


Next set f(7) = Inn + D,. Recall that de Polignac had shown that D, was small 
compared to Inn, but Mertens required that D, be bounded. This was easily achieved. 
Mertens showed that D,, < 2 by computing bounds explicitly, as Chebyshev had done 
in his work on primes. Now observe that 


1 
1 1 1 1 In (1 os at) 
n 
‘s nPinn (n+1)?In(n+ 1) nP (n+1)P (n+1)/ Inn + 1) 
1 1 1 


ne (n+1)@) (n+1)!*/ In~ + 1) 
Xr 
2n(n + 1)!+? In(n + 1)’ 
where $0 < \lambda < 1$. Applying this in (30.63) and after cancellation of terms, he had 


ae . a2 R 30.64 
Dera om cee ; ns) 
p>G n=G+1 
where 
_ In(G +1)— f@) 1 ae 3 1 
~ (G4+hemG+h (G+lh*mG+h | 


1+ 
are 2n(n + 1)'T? In(n +1) 


= 1 1 
* Pz Pn (— ee): 


n=G+l1 


It is easy to show that R=O (2) and, in fact, Mertens proved that |R| < 


NGED oo cus: To estimate the sum )°> 4. first note that 
[o,@) 
1 G" 4 
Dye ge (30.65) 
n=G+1 
where 
tee ed VAROAT: SR A 
(fe | hate 
, 2 a nett | 2.3 ~ vote 
n=G+1 n=G+1 
To prove (30.65), observe that the binomial expansion of (1 — a) immediately 
implies 
1 1 = 1 itt 1 JOFrOO+2) 1 ; 
in’ ttn+ly¥ (n+l)! 2) (n 41)? 2.3 (nt+13te 0° 
The required result followed when this formula was summed from n = G to 


n = oo. Now integrating (30.65) from p to 1, obtain 


lee) ioe) 1 —t 1 
1 1 G 
ae = dt — R' at 
a n+? Inn 2 n2 Inn i, t / 
n=G+1 n=G+1 


i dx te ( 1 —)a 
= = = XxX 
ping eA pinG iad | x 

oo Gx I 

/ dx [ Ra. 

1 x p 


The first integral in this expression could be written as 


fore) e7* 00 
/ dx =In(1 —e”*) 
p pinG 


ingle 


= —In(1 — G~?); 
the second one, by Gauss’s formula for $\psi(1) = \Gamma'(1) = -\gamma$, given in Exercise 17.11, 
would be 


lee) pinG 1 ew* 
/ -| (= 1 —) dv=y +0. 
0 0 aa 


Also 
7 1 
—Ind — GG’) =In— —InInG + (p), 
p 
so that 
(oe) (oe) 
1 1 ee dx 1 
= 1 InInG G *— —— 
Ds n'+P Inn a ae e i x Ds n7Inn 
n=G+1 n=G+1 
1 
-| R’ dt + (p). 
p 
Next, 
1 fore) 1 fore) 
/ 
eee | Se eh abe a 
n=G+1 n=G+1 
(oe) CO 
1 1 1 1 
: as er aun) ye (an aan) 
n=G+1 n=G+1 
(oe) CO 
1 1 1 1 
< Lan< Dina —1 aa) < GEG 
ne nn poe (n—1)In(2~—1)— ninn n 
and 
eGo & 1 
/ dx < : 
1 x GinG 
Hence, 
3 z afte InInG + : + (p) 
nit+Pinn —p a eae ee 
n=G+l1 


When combined with (30.64), this gave (30.62). Mertens’s version of Legendre’s 
formula for $\sum_{p\le G}\frac1p$ followed from (30.62) and (30.61). Mertens’s formula for 
the product was an easy corollary. 

In 1926, Hardy commented on Mertens’s proof:²⁵ “The proof is rather difficult to 
seize or to remember, since it depends on a combination of the method of Tchebycheff 
on the one hand and the theory of Dirichlet’s series on the other, and it may be 
worth while to give an alternative proof.” Hardy himself provided two proofs, the 
first published in 1927²⁶ using an integral analog of Littlewood’s Tauberian theorem, 
and the second in 1935²⁷ using the analog of the simpler Tauber’s theorem. Hardy’s 
proofs are quite interesting, but we note that if Mertens’s proof is recast in terms of 
the Stieltjes integral, a very simple proof of (30.62) emerges; note that the latter is the 
only complex argument in the proof. 

25 Hardy (1966-1979) vol. 2, pp. 210-212, especially p. 210. 

To begin this short proof, first observe that 


1 CO 
—_ = x? Inx df (x), 
ss pre te 


p>G+1 


where f(x) = Dee ™P — Inx +e, |e] < 3. Then integration by parts and an easy 
calculation produce 


1 ie eeu 1 
>: a = +6) 7 du 4 o(-5). 


p>G+l P In(G+1) 


Next, again by Gauss’s formula for I’’(1) (see Exercise 17.11), this integral can be 
evaluated as 


le) p In(G+1) eX le) 1 
/ a ee ax 
In(G+1) * —1 x piney e*= 1 


=i l(t =(Gr-1)£) 


=y+(p) om + InIn(G + 1) + (p). 


This proves (30.62) so that the proof of Mertens’s formula can now be completed 
as before. 


30.6 Riemann’s Formula for z (x) 


Riemann’s eight-page paper of 1859,78 containing his formula for the number of 
primes less than a given number x, was actually an outline of a research program for 
the advancement of the theory of distribution of primes. He proved very few statements 
in this paper, but clearly set forth his conjectures and how some of them they might 
be verified. It took fifty years of development in complex analysis to prove the first 
approximation of his formula, the prime number theorem. Almost a century after 
Riemann’s paper appeared, Hardy’s student and Oxford professor Edward Titchmarsh 
wrote, “The memoir in which Riemann first considered the zeta-function has become 


26 ibid. 
27 ibid. pp. 230-233. 
28 Riemann (1859). 
famous for the number of ideas it contains which have since proved fruitful, and it is
by no means certain that these are even now exhausted.”²⁹

Recall that Dirichlet and Chebyshev employed Mellin transforms to study 
prime numbers, but that they limited themselves to real variables. Riemann’s great 
innovation was to employ complex variables. He expressed the Mellin transform of 
his arithmetic function f(x) defined by (30.14) in terms of the zeta function; then by 
Mellin inversion, he expressed f(x) as an integral in the complex plane. This made it 
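possible to apply the powerful machinery of complex integration. Riemann observed
that if

$$p^{-s} = s\int_{p}^{\infty} x^{-s-1}\,dx, \quad p^{-2s} = s\int_{p^2}^{\infty} x^{-s-1}\,dx, \quad \ldots$$

were used in

$$\log \zeta(s) = -\sum_{p} \log\left(1 - p^{-s}\right) = \sum_{p} p^{-s} + \frac{1}{2}\sum_{p} p^{-2s} + \frac{1}{3}\sum_{p} p^{-3s} + \cdots,$$

he got

$$\frac{\log \zeta(s)}{s} = \int_{1}^{\infty} f(x)\,x^{-s-1}\,dx, \quad \operatorname{Re} s > 1.$$

He then applied the Fourier inversion formula to obtain an integral expression for
$f(x)$:

$$f(x) = \frac{1}{2\pi i}\int_{a-\infty i}^{a+\infty i} \frac{\log \zeta(s)}{s}\,x^{s}\,ds, \quad a > 1.$$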
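Riemann set

$$\xi(s) = \Pi\left(\frac{s}{2}\right)(s-1)\,\pi^{-s/2}\,\zeta(s), \tag{30.66}$$

and by using (30.17), he obtained

$$\log \zeta(s) = \frac{s}{2}\log \pi - \log(s-1) - \log \Pi\left(\frac{s}{2}\right) + \sum_{\beta}\log\left(1 - \frac{s}{\beta}\right) + \log \xi(0).$$

Riemann noted, however, that when this expression was used in the integral for
$f(x)$, the integral became divergent. So he applied integration by parts to get

$$f(x) = -\frac{1}{2\pi i}\,\frac{1}{\log x}\int_{a-\infty i}^{a+\infty i} \frac{d}{ds}\left[\frac{\log \zeta(s)}{s}\right] x^{s}\,ds.$$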


29 Titchmarsh and Heath-Brown (1986) p. 254. 
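Next he observed that

$$-\log \Pi\left(\frac{s}{2}\right) = \lim_{m \to \infty}\left(\sum_{n=1}^{m} \log\left(1 + \frac{s}{2n}\right) - \frac{s}{2}\log m\right)$$

and therefore

$$-\frac{d}{ds}\left[\frac{1}{s}\log \Pi\left(\frac{s}{2}\right)\right] = \sum_{n=1}^{\infty} \frac{d}{ds}\left[\frac{1}{s}\log\left(1 + \frac{s}{2n}\right)\right].$$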


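Hence, every term in the expression for $f(x)$, except for the term

$$\frac{1}{2\pi i}\,\frac{1}{\log x}\int_{a-\infty i}^{a+\infty i} \frac{1}{s^2}\log \xi(0)\,x^{s}\,ds = \log \xi(0),$$

took the form

$$\pm\frac{1}{2\pi i}\,\frac{1}{\log x}\int_{a-\infty i}^{a+\infty i} \frac{d}{ds}\left[\frac{1}{s}\log\left(1 - \frac{s}{\beta}\right)\right] x^{s}\,ds.$$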


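To evaluate this integral, Riemann observed that

$$\frac{d}{d\beta}\left[\frac{1}{s}\log\left(1 - \frac{s}{\beta}\right)\right] = \frac{1}{(\beta - s)\beta}.$$

Thus, for $\operatorname{Re}(s - \beta) > 0$, he had

$$-\frac{1}{2\pi i}\,\frac{d}{d\beta}\int_{a-\infty i}^{a+\infty i} \frac{1}{s}\log\left(1 - \frac{s}{\beta}\right) x^{s}\,ds = -\frac{1}{2\pi i}\int_{a-\infty i}^{a+\infty i} \frac{x^{s}\,ds}{(\beta - s)\beta} = \frac{x^{\beta}}{\beta} = \begin{cases} \int_{\infty}^{x} t^{\beta-1}\,dt, & \text{when } \operatorname{Re}\beta < 0, \\ \int_{0}^{x} t^{\beta-1}\,dt, & \text{when } \operatorname{Re}\beta > 0. \end{cases}$$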


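Riemann could then conclude that

$$\frac{1}{2\pi i}\,\frac{1}{\log x}\int_{a-\infty i}^{a+\infty i} \frac{d}{ds}\left[\frac{1}{s}\log\left(1 - \frac{s}{\beta}\right)\right] x^{s}\,ds = -\frac{1}{2\pi i}\int_{a-\infty i}^{a+\infty i} \frac{1}{s}\log\left(1 - \frac{s}{\beta}\right) x^{s}\,ds = \begin{cases} \int_{\infty}^{x} \frac{t^{\beta-1}}{\log t}\,dt, & \text{when } \operatorname{Re}\beta < 0, \\ \int_{0}^{x} \frac{t^{\beta-1}}{\log t}\,dt, & \text{when } \operatorname{Re}\beta > 0. \end{cases}$$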
Note that it was clear that

$$\int_{0}^{x} \frac{t^{\beta-1}}{\log t}\,dt = \int_{0}^{x^{\beta}} \frac{du}{\log u}.$$


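By using these results in the expression for $f(x)$, Riemann obtained his famous
formula

$$f(x) = \operatorname{Li}(x) - \sum_{\operatorname{Im}\beta > 0}\left[\operatorname{Li}\left(x^{\beta}\right) + \operatorname{Li}\left(x^{1-\beta}\right)\right] + \int_{x}^{\infty} \frac{dt}{(t^2 - 1)\,t\log t} + \log \xi(0).$$

In writing this formula, Riemann assumed the truth of the Riemann hypothesis. He
wrote $\beta = \frac{1}{2} + i\alpha$, and $1 - \beta = \frac{1}{2} - i\alpha$, so that the expression in the sum appeared as

$$\operatorname{Li}\left(x^{\frac{1}{2} + \alpha i}\right) + \operatorname{Li}\left(x^{\frac{1}{2} - \alpha i}\right).$$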


One may verify that the integral evaluations as sketched by Riemann are indeed
correct; one may also consult Harold Edwards’s book, offering a detailed discussion
of Riemann’s paper.³⁰
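The expansion of $\log \zeta(s)$ used above weights each prime power $p^n$ by $1/n$. As a small illustration, the following Python sketch computes a function of that kind exactly, on the assumption that the arithmetic function $f(x)$ of (30.14) is the sum of $1/n$ over all prime powers $p^n \le x$; the helper names `primes_up_to` and `riemann_f` are ours, and Riemann’s convention of averaging at the jumps is ignored.

```python
from fractions import Fraction


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def riemann_f(x):
    """Sum 1/n over all prime powers p**n <= x (assumed form of f)."""
    total = Fraction(0)
    for p in primes_up_to(x):
        power, n = p, 1
        while power <= x:
            total += Fraction(1, n)  # the prime power p**n contributes 1/n
            power *= p
            n += 1
    return total


if __name__ == "__main__":
    for x in (10, 100, 1000):
        print(x, float(riemann_f(x)))
```

For $x = 100$ the terms are the 25 primes, 4 squares, 2 cubes, 2 fourth powers, one fifth power, and one sixth power, so the sum is $25 + 2 + \frac{2}{3} + \frac{1}{2} + \frac{1}{5} + \frac{1}{6} = \frac{428}{15}$.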


30.7 Exercises 
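(1) Using Chebyshev’s notation, show that for $T(x) = \ln \lfloor x \rfloor! - 2\ln \lfloor \frac{x}{2} \rfloor!$

$$\psi(x) - \psi\left(\frac{x}{2}\right) \le T(x) \le \psi(x) - \psi\left(\frac{x}{2}\right) + \psi\left(\frac{x}{3}\right).$$

Apply Stirling’s approximation to prove that $T(x) < \frac{3x}{4}$ for $x > 0$ and $T(x) > \frac{2x}{3}$ for $x > 300$. Use this to show that $\psi(x) < \frac{3x}{2}$. Now show that $\psi(x) - 2\psi(\sqrt{x}) \le \theta(x) \le \psi(x)$ and, therefore,

$$\psi(x) - \psi\left(\frac{x}{2}\right) + \psi\left(\frac{x}{3}\right) \le \theta(x) + 2\psi(\sqrt{x}) - \theta\left(\frac{x}{2}\right) + \psi\left(\frac{x}{3}\right) \tag{30.67}$$

$$< \theta(x) - \theta\left(\frac{x}{2}\right) + \frac{x}{2} + 3\sqrt{x}. \tag{30.68}$$

Apply these results to show that

$$\theta(x) - \theta\left(\frac{x}{2}\right) > \frac{x}{6} - 3\sqrt{x} \quad \text{for } x > 300.$$

Show that this proves Bertrand’s conjecture for $x \ge 162$. Finally, show that

$$\pi(x) - \pi\left(\frac{x}{2}\right) > \frac{1}{\ln x}\left(\frac{x}{6} - 3\sqrt{x}\right) \quad \text{for } x > 300.$$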


See Ramanujan (2000) pp. 208-209. 
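Before attempting the proofs in Exercise 1, it can be reassuring to test the inequalities numerically. The following Python sketch is only a floating-point sanity check, not a proof; the function names (`theta`, `psi`, `big_t`, `check_exercise`) and the tolerance are our own choices, and $T(x)$ is evaluated through the log-gamma function.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    limit = int(limit)
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def theta(x, primes):
    """Chebyshev's theta: sum of ln p over primes p <= x."""
    return sum(math.log(p) for p in primes if p <= x)


def psi(x, primes):
    """Chebyshev's psi: sum of ln p over prime powers p**k <= x."""
    total = 0.0
    for p in primes:
        if p > x:
            break
        power = p
        while power <= x:
            total += math.log(p)
            power *= p
    return total


def big_t(x):
    """T(x) = ln floor(x)! - 2 ln floor(x/2)!, computed with lgamma."""
    return math.lgamma(math.floor(x) + 1) - 2 * math.lgamma(math.floor(x / 2) + 1)


def check_exercise(x, tol=1e-6):
    """Evaluate the inequalities of Exercise 1 at a single x > 300."""
    primes = primes_up_to(x)
    lower = psi(x, primes) - psi(x / 2, primes)
    upper = lower + psi(x / 3, primes)
    gap = theta(x, primes) - theta(x / 2, primes)
    return {
        "psi bounds on T": lower - tol <= big_t(x) <= upper + tol,
        "T < 3x/4": big_t(x) < 3 * x / 4,
        "T > 2x/3": big_t(x) > 2 * x / 3,
        "theta gap": gap > x / 6 - 3 * math.sqrt(x),
    }


if __name__ == "__main__":
    for x in (301, 500, 1000, 5000):
        print(x, check_exercise(x))
```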


30 Edwards (2001). 
(2) Let C(m,n) denote the binomial coefficient m choose n, that is, the number of
ways of choosing n objects out of m distinct objects. Show that the exponent of
a prime p in C(2n,n) is given by

$$d = \sum_{p^k \le 2n}\left(\left\lfloor \frac{2n}{p^k} \right\rfloor - 2\left\lfloor \frac{n}{p^k} \right\rfloor\right).$$

Show that $d \le 1$ for $\sqrt{2n} < p \le 2n$, and that $d = 0$ for $p > 2n$ and for $\frac{2n}{3} < p \le n$.
Use these results to conclude that

$$C(2n,n) \le \prod_{p \le \sqrt{2n}} (2n) \prod_{\sqrt{2n} < p \le \frac{2n}{3}} p \prod_{n < p \le 2n} p.$$

Note that if Bertrand’s conjecture is false for some n, then

$$C(2n,n) \le (2n)^{\sqrt{2n}} \prod_{\sqrt{2n} < p \le \frac{2n}{3}} p. \tag{30.69}$$


(3) Show that $2n\,C(2n,n) \ge 4^n$. Prove that for $a \ge 5$, $C(2a,a) < 4^{a-1}$ and that
$\prod_{a < p \le 2a} p \le C(2a,a)$. Use the last two inequalities to show that $\prod_{p \le n} p < 4^n$.
Combine these results with (30.69) to show that if Bertrand’s conjecture
is false then we get a contradiction. See Erdős (1932). Paul Erdős (1913–1996)
founded many aspects of combinatorics and popularized this area of mathematics
by continuously traveling all over the world and collaborating with hundreds
of mathematicians.
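The claims about the exponent of a prime in C(2n,n) in Exercises 2 and 3 are easy to test by machine. The Python sketch below compares the floor-sum expression for the exponent with a direct factorization of the central binomial coefficient and checks the stated bounds; the function names are ours, and the check of $d = 0$ for $\frac{2n}{3} < p \le n$ is run only for $n \ge 3$, since the case $n = 2$, $p = 2$ is a small exception.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def exponent_by_floor_sum(n, p):
    """Exponent of p in C(2n, n) as a sum of floor(2n/p^k) - 2 floor(n/p^k)."""
    d, power = 0, p
    while power <= 2 * n:
        d += (2 * n) // power - 2 * (n // power)
        power *= p
    return d


def exponent_by_division(m, p):
    """Exponent of p in the positive integer m, by repeated division."""
    d = 0
    while m % p == 0:
        m //= p
        d += 1
    return d


def verify(n):
    """Check the exponent formula and the bounds of Exercises 2 and 3 for one n >= 3."""
    central = math.comb(2 * n, n)
    for p in primes_up_to(2 * n):
        d = exponent_by_floor_sum(n, p)
        if d != exponent_by_division(central, p):
            return False
        if p * p > 2 * n and d > 1:          # d <= 1 when p > sqrt(2n)
            return False
        if 3 * p > 2 * n and p <= n and d != 0:  # d = 0 when 2n/3 < p <= n
            return False
    return 2 * n * central >= 4 ** n


if __name__ == "__main__":
    print(all(verify(n) for n in range(3, 200)))
```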


(4) Let $\varphi(1) = 1$ and let $\varphi(n)$, $n > 1$, be the number of numbers less than n and
prime to n. Let $F(t) = \sum_{n \le t} \varphi(n)$. Prove that $F(t) = \frac{3t^2}{\pi^2} + O(t \ln t)$. See
Mertens (1874b) pp. 290-292.
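A quick computation makes the asymptotic statement of Exercise 4 plausible. The Python sketch below, with function names of our own choosing, tabulates Euler’s function by a sieve and compares the summatory function with $3t^2/\pi^2$; it illustrates the size of the error term but proves nothing.

```python
import math


def totients_up_to(limit):
    """Return a list phi with phi[n] = Euler's totient of n for 1 <= n <= limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p has not been touched by a smaller prime, so p is prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
    return phi


def totient_sum(t):
    """F(t) = phi(1) + phi(2) + ... + phi(t)."""
    return sum(totients_up_to(t)[1:])


if __name__ == "__main__":
    for t in (10, 100, 1000, 10000):
        main_term = 3 * t * t / math.pi ** 2
        print(t, totient_sum(t), round(main_term, 1))
```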


(5) Show that the integral $\Gamma(s) = \int_{0}^{\infty} x^{s-1}e^{-x}\,dx$ has the inversion

$$e^{-x} = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty} \Gamma(s)\,x^{-s}\,ds, \quad a > 0,\ \operatorname{Re} x > 0.$$

See Cahen (1894).


(6) Prove Ramanujan’s formula

$$\int_{0}^{\infty} \frac{\cos \pi x^2}{\sinh \pi x}\,\sin(2\pi t x)\,dx = \frac{\cosh \pi t - \cos \pi t^2}{2\sinh \pi t}.$$
Use this formula to show that for −3 < Re s < 4, we have


T(s) 
2s 


OP pssst rr i sin? (=) 
——_ _ r = “Sys d 
[ (5) oe a * sinhx 


See Ramanujan (2000) p. 64, for his formula. For the other formula, see 
Mustafy (1966). 
Show that for &(s) defined by (30.66) and for 0 < Res < 1,A > 0, 


(1 — 2°)(1 — 2'5)g(s) 


(7 


—, 


1 
7 1 Ss Ss 
[ u-2k(A, 2) (un? + ui-*) du = 5 BUS)E(S) (Co +2475) ; 


where sa" ‘n )r(S). 


4 2 2 


[oe] 2, 
P(A,u) = (fer gates ) x dx 
0 


e2mx = 1? 


ee _ (a! a-4) ud(A,u2). 


Prove also that k(A,u) > 0 for 0 < u < 1, and that a number sp = op + if 
with 0 < o9 < 1 is a zero of &(s) if and only if for every A > 0 


Pe il yeti, Sih 
/ u_2k(A,u) (we 2+ U2 ) du =0. 
0 


See Mustafy (1972). Ashoke Kumar Mustafy had a thirty-year career in the 
Indian Administrative Service, including as Vice Chancellor of Lucknow Uni- 
versity during 1973-75. In spite of his heavy administrative duties, he worked 
on mathematics six to seven hours per day and had time to discuss mathematics 
with a young boy like the author. Mustafy hoped that his result would be useful 
in proving the Riemann hypothesis; indeed, in this connection he communi- 
cated with André Weil who wrote that he found Mustafy’s work promising. 


30.8 Notes on the Literature 


See Smith (1959) pp. 127-148, for an English translation of some parts of 
Chebyshev’s two papers on primes. Delone (2005), an English translation by R. Burns 
of Delone’s Russian original of 1947, gives a detailed commentary on Chebyshev’s 
papers and a discussion of the major contributions to number theory of St. Petersburg 
mathematicians in the period 1847-1947. Edwards (2001) presents a detailed and 
fascinating discussion of Riemann’s 1859 paper and some of its consequences. 
Narkiewicz (2000) offers an excellent exposition of the development of the prime 
number theorem and provides a comprehensive list of references. 
Invariant Theory: Cayley and Sylvester


31.1 Preliminary Remarks 


The invariant theory of forms, with forms defined as homogeneous polynomials in 
several variables, was developed extensively in the nineteenth century as an important 
branch of algebra but with very close connections to algebraic geometry. Several ideas 
and methods of invariant theory were influential in diverse areas of mathematics: 
topics as concrete as enumerative combinatorics and the theory of partitions and as 
general as twentieth-century abstract commutative algebra. 

George Boole, the highly original British mathematician, may be taken as the 
founder of invariant theory, though early examples of the use of invariance can be 
found in the works of Lagrange, Laplace, and Gauss. Boole had almost no formal 
training in mathematics, but he carefully studied the work of great mathematicians, 
including Newton, Lagrange, and Laplace. In a paper on analytic geometry written in
1839,¹ Boole took the first tentative steps toward the idea of invariance, but he gave
a clearly formulated definition in his 1841 “Exposition of a General Theory of Linear
Transformations.”² He wrote that he found his inspiration in Lagrange’s researches on
the rotation of rigid bodies, contained in the 1788 Mécanique analytique. Lagrange’s
result is most economically described in terms of matrices, a concept developed in
the 1850s by Cayley. In modern terms, Lagrange’s problem was to diagonalize a
3 × 3 symmetric matrix $A$; Lagrange expressed this in terms of binary quadratic forms.
Given a quadratic form $x^t A x$, with $x$ a three vector, the problem would be to find a
matrix $P$ such that $P P^t = I$, the identity matrix, and $P^t A P$ is a diagonal matrix. This
means that if $x_1, x_2, x_3$ are the components of $x$, $y_1, y_2, y_3$ of $y = P^t x$, and $\lambda_1, \lambda_2, \lambda_3$
are the diagonal entries in the diagonal matrix, then

$$x^t A x = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \lambda_3 y_3^2, \tag{31.1}$$

$$x_1^2 + x_2^2 + x_3^2 = y_1^2 + y_2^2 + y_3^2. \tag{31.2}$$

1 Boole (1939).
2 Boole (1941).
It is not surprising that this result of Lagrange also served as the starting point
of the spectral theory of matrices. Cauchy, Weierstrass, and Frobenius were the
primary developers of this aspect of matrix theory. But Boole took a different turn;
he considered a homogeneous polynomial of degree n in m variables and applied a
linear transformation to the variables to obtain a new homogeneous polynomial of
degree n in m variables. He wished to determine the relations between the coefficients
of the two polynomials. Boole’s method may perhaps be best understood by studying
his simplest example. Let $Q = ax_1^2 + 2bx_1x_2 + cx_2^2$ be a binary quadratic form. Set its
two partial derivatives equal to zero and then eliminate the variables $x_1$ and $x_2$. Thus,

$$2ax_1 + 2bx_2 = 0 \quad \text{and} \quad 2bx_1 + 2cx_2 = 0. \tag{31.3}$$

Elimination of the variables $x_1$, $x_2$ gives

$$\theta(Q) = b^2 - ac = 0. \tag{31.4}$$

Now apply the linear transformation

$$x_1 = py_1 + qy_2, \quad x_2 = ry_1 + sy_2, \tag{31.5}$$

where p, q, r, s are real numbers with $ps - qr \ne 0$, to get a new quadratic form
$R = Ay_1^2 + 2By_1y_2 + Cy_2^2$. A calculation similar to the previous one gives $\theta(R) = B^2 - AC$. Boole pointed out that

$$\theta(R) = (ps - qr)^2\,\theta(Q); \tag{31.6}$$

the quantity $ps - qr$ is the determinant of the linear transformation (31.5). In addition,
the degrees of the homogeneous polynomials $\theta(Q)$ and $\theta(R)$ are defined as
equal to the degree of each term, in this case 2.
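Boole’s relation (31.6) can be checked directly with integer arithmetic. In the following Python sketch the coefficients A, B, C are obtained by expanding Q after the substitution (31.5); the function names `transform_quadratic` and `theta` and the sample values are ours.

```python
def transform_quadratic(a, b, c, p, q, r, s):
    """Coefficients (A, B, C) of R = A*y1^2 + 2*B*y1*y2 + C*y2^2 obtained from
    Q = a*x1^2 + 2*b*x1*x2 + c*x2^2 by x1 = p*y1 + q*y2, x2 = r*y1 + s*y2."""
    A = a * p * p + 2 * b * p * r + c * r * r
    B = a * p * q + b * (p * s + q * r) + c * r * s
    C = a * q * q + 2 * b * q * s + c * s * s
    return A, B, C


def theta(a, b, c):
    """Boole's eliminant b^2 - ac of the binary quadratic form."""
    return b * b - a * c


if __name__ == "__main__":
    a, b, c = 2, 3, 5
    p, q, r, s = 1, 2, 3, 4
    A, B, C = transform_quadratic(a, b, c, p, q, r, s)
    det = p * s - q * r
    # Both sides of (31.6): theta(R) and (ps - qr)^2 * theta(Q).
    print(theta(A, B, C), det ** 2 * theta(a, b, c))
```

With these sample values, $(A, B, C) = (65, 94, 136)$, so $\theta(R) = 94^2 - 65 \cdot 136 = -4$, while $(ps - qr)^2\,\theta(Q) = 4 \cdot (-1) = -4$.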

More generally, Boole showed that, with $Q_n$ a homogeneous polynomial of degree
n in m variables, if $R_n$ was the polynomial obtained after the application to $Q_n$ of a
linear transformation with determinant $E$, and if $\theta(Q_n)$ and $\theta(R_n)$ were obtained by
the elimination process described earlier, then

$$\theta(R_n) = E^{\gamma n/m}\,\theta(Q_n). \tag{31.7}$$

Here $\gamma$ represented the degree of $\theta(R_n)$ and $\theta(Q_n)$. In the 1841 paper, Boole stated but
did not prove this theorem, though he gave a few examples to illustrate it. He indicated
a proof in a paper appearing four years later.³ Note that the polynomial $\theta(Q_n)$ is
termed an invariant because it satisfies the relation (31.7). Sylvester introduced the
term invariant in a long paper on the subject published in 1853, and he coined many
other terms used in invariant theory.

At the end of the second part of his 1841 paper, Boole wrote that mathematicians 
should find invariant theory a fertile area for research and discovery. Indeed, Boole’s 
paper had an immediate impact on Cayley who, upon reading it in 1844, wrote to 


3 Boole (1845). 
Boole of his enthusiasm for this new area of mathematics.⁴ Cayley was then a recent
graduate of Cambridge University and had published an 1843 paper on determinants⁵
in which he introduced the concept of hyperdeterminants or multidimensional
determinants. In a paper of 1845, “On the Theory of Linear Transformations,”⁶ Cayley
applied these hyperdeterminants to generate new invariants. Cayley’s work arose out
of his efforts to generalize some well-known results. For example, the invariant $ac - b^2$
for the binary quadratic was known to be the determinant

$$\begin{vmatrix} a & b \\ b & c \end{vmatrix},$$

while the invariant

$$abc + 2fgh - ah^2 - bg^2 - cf^2$$

for the ternary quadratic

$$ax_1^2 + bx_2^2 + cx_3^2 + 2fx_1x_2 + 2gx_1x_3 + 2hx_2x_3$$

was the determinant

$$\begin{vmatrix} a & f & g \\ f & b & h \\ g & h & c \end{vmatrix}. \tag{31.8}$$

The first fact was already contained in Boole; Cayley presented the second invariant
in his paper. As an example of the role of hyperdeterminants, so named by Cayley in
1845, he considered the multilinear form

$$\sum \alpha_{ijkl}\,x_i y_j z_k w_l,$$

where the indices i, j, k, l assumed only the values 1 and 2. Each of the four pairs of
variables $(x_1, x_2)$, $(y_1, y_2)$, $(z_1, z_2)$, $(w_1, w_2)$ could then be linearly transformed by 2 × 2
matrices. So the multilinear form corresponded to a 2 × 2 × 2 × 2 matrix and Cayley
used hyperdeterminants to compute an invariant for this form. He then specialized the
multilinear form by setting $x_1 = y_1 = z_1 = w_1 = x$ and $x_2 = y_2 = z_2 = w_2 = y$; he
then identified the coefficients to get the binary quartic

$$u = ax^4 + 4bx^3y + 6cx^2y^2 + 4dxy^3 + ey^4, \tag{31.9}$$

where $a = \alpha_{1111}$, $b = \alpha_{2111} = \alpha_{1211} = \alpha_{1121} = \alpha_{1112}$, and so on. By making a similar
identification in the invariant for the multilinear form, he obtained the second-degree
invariant for the binary quartic:

$$I_2 = ae - 4bd + 3c^2. \tag{31.10}$$

4 Crilly (2006) p. 86.
5 Cayley (1843).
6 Cayley (1845b).
Cayley realized that his result was different from Boole’s invariant $\theta(u)$. He
communicated his result to Boole, who pointed out that there was also an invariant
of the third degree, given by Cayley as:⁷

$$I_3 = ace - b^2e - ad^2 - c^3 + 2bcd.$$

Cayley in turn showed that his invariant as well as Boole’s third-order invariant
could most easily be derived by a method Boole had indicated in his very first paper,
written in 1839. Boole then informed Cayley of yet another result, obtained by trial
and error:

$$\theta(u) = I_2^3 - 27I_3^2, \tag{31.11}$$

showing that the three invariants $\theta(u)$, $I_2$, $I_3$ were algebraically dependent. Cayley was
intrigued by this result and computed invariants with still greater fervor, though by
means of new methods.⁸
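The invariance of $I_2$ and $I_3$ can be tested on examples by carrying out the substitution (31.5) on the quartic (31.9) and recomputing the two expressions. The Python sketch below does this with exact rational arithmetic; the helper names are ours, and the exponents 4 and 6 of the determinant used in the check are the weights of the two invariants in the standard theory, assumed here rather than derived.

```python
from fractions import Fraction
from math import comb


def poly_mul(u, v):
    """Multiply two binary forms given by coefficient lists (index = power of y2)."""
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out


def poly_pow(u, k):
    """k-th power of a binary form given by its coefficient list."""
    out = [1]
    for _ in range(k):
        out = poly_mul(out, u)
    return out


def transform(a, p, q, r, s):
    """Given binomially normalized coefficients a = [a_0, ..., a_n] of a binary
    form, return [A_0, ..., A_n] after x1 = p*y1 + q*y2, x2 = r*y1 + s*y2."""
    n = len(a) - 1
    plain = [0] * (n + 1)
    for k, ak in enumerate(a):
        # x1^(n-k) * x2^k expressed in y1, y2
        term = poly_mul(poly_pow([p, q], n - k), poly_pow([r, s], k))
        for j, t in enumerate(term):
            plain[j] += comb(n, k) * ak * t
    return [Fraction(c, comb(n, j)) for j, c in enumerate(plain)]


def invariant_i2(a):
    a0, a1, a2, a3, a4 = a
    return a0 * a4 - 4 * a1 * a3 + 3 * a2 ** 2


def invariant_i3(a):
    a0, a1, a2, a3, a4 = a
    return a0 * a2 * a4 - a1 ** 2 * a4 - a0 * a3 ** 2 - a2 ** 3 + 2 * a1 * a2 * a3


if __name__ == "__main__":
    quartic = [1, 2, 0, -1, 3]   # a, b, c, d, e of (31.9); sample values
    p, q, r, s = 2, 1, 1, 3
    delta = p * s - q * r
    new = transform(quartic, p, q, r, s)
    print(invariant_i2(new) == delta ** 4 * invariant_i2(quartic))
    print(invariant_i3(new) == delta ** 6 * invariant_i3(quartic))
```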

Boole soon abandoned invariant theory in favor of analysis and logic; because 
of the unwieldy computational difficulties of hyperdeterminants, Cayley also gave 
up using them to find invariants. Perhaps surprisingly, Gelfand, Kapranov, and 
Zelevinsky rediscovered and promoted the study of hyperdeterminants. In their 1994 
Discriminants, Resultants, and Multidimensional Determinants,⁹ they wrote that
although hyperdeterminants had been largely abandoned for 150 years, they found 
them to be important in their attempt to construct a general theory of hypergeometric 
functions in several variables. 

However, the relation (31.11) suggested to Cayley and Sylvester an important 
problem, and they began work on it in the 1850s: Determine invariants $I_1, I_2, \ldots, I_s$
of a binary quantic such that all other invariants would be of the form $P(I_1, I_2, \ldots, I_s)$,
for some polynomial P. The English mathematicians Arthur Cayley (1821-1895) 
and J. J. Sylvester were mathematical friends, reminding us of Euler and Goldbach 
before them and Hardy and Littlewood after them. Cayley and Sylvester met in 1847 
as law students; they remained close friends for almost fifty years until Cayley’s 
death, meeting as frequently as possible and exchanging hundreds of letters and 
hand-delivered notes. Both algebraists, they often worked simultaneously on the same 
topic. One may ask why they published no joint work. First, Cayley was a reserved 
and reticent person, while Sylvester was extremely ebullient and volatile. Moreover, 
Sylvester exhibited a strong need to maintain strict mathematical priority, both for 
himself and others. For example, in 1882 Sylvester wrote a paper on partitions, divided 
into a number of distinct sections, each with its own heading and authorship, indicating
whether that portion of the argument should be credited to himself or to his student 
Franklin. In spite of the apparent separateness of their work, Cayley and Sylvester’s 
mutual support and motivation surely led each of them to more progress than they 


7 ibid. pp. 205-206. 
8 Cayley (1846) p. 104. 
9 Gelfand, Kapranov, Zelevinsky (1994). 




might have achieved separately. E. T. Bell aptly labeled Cayley and Sylvester the
invariant twins;¹⁰ we remark that they must have been fraternal twins.

In order to look at the work of Cayley and Sylvester after 1850, we give some
definitions in slightly modernized form, largely following Hilbert’s notation, presented
in his 1897 lectures.¹¹ Cayley and Sylvester worked primarily on invariants of binary
quantics. These are polynomials in two variables, of the form

$$f(x_1, x_2) = a_0x_1^n + \binom{n}{1}a_1x_1^{n-1}x_2 + \binom{n}{2}a_2x_1^{n-2}x_2^2 + \cdots + a_nx_2^n. \tag{31.12}$$

Suppose the linear transformation (31.5) with determinant $\delta = ps - qr \ne 0$ converts
$f(x_1, x_2)$ into

$$A_0y_1^n + \binom{n}{1}A_1y_1^{n-1}y_2 + \binom{n}{2}A_2y_1^{n-2}y_2^2 + \cdots + A_ny_2^n. \tag{31.13}$$

An invariant $I$ of $f(x_1, x_2)$ is then a polynomial in the coefficients $a_0, a_1, \ldots, a_n$,
denoted by $I(a_0, a_1, \ldots, a_n)$, such that for some integer p

$$I(A_0, A_1, \ldots, A_n) = \delta^p\,I(a_0, a_1, \ldots, a_n), \tag{31.14}$$

where $A_0, A_1, \ldots, A_n$ are given by (31.13). Next, a covariant of $f(x_1, x_2)$, denoted
by $C(a_0, a_1, \ldots, a_n, x_1, x_2)$, is defined as a polynomial in $a_0, a_1, \ldots, a_n$ and in $x_1, x_2$,
such that

$$C(A_0, A_1, \ldots, A_n, y_1, y_2) = \delta^p\,C(a_0, a_1, \ldots, a_n, x_1, x_2), \tag{31.15}$$

with $A_0, A_1, \ldots, A_n$ again defined by (31.13).
Within this notation, the invariant of the quadratic form is $a_1^2 - a_0a_2$; for the quartic
form, the invariants mentioned earlier would be

$$I_2 = a_0a_4 - 4a_1a_3 + 3a_2^2 \quad \text{and} \quad I_3 = a_0a_2a_4 - a_1^2a_4 - a_0a_3^2 - a_2^3 + 2a_1a_2a_3.$$

Two of these invariants are homogeneous polynomials of degree 2 and the third is
of degree 3. If the coefficient $a_k$ is assigned a weight k, then the weight of each term
in $a_1^2 - a_0a_2$ can be given the value 2 by adding the weights in each product. Thus, this
invariant is said to be of weight 2. Similarly, the weights of the other two invariants are
4 and 6. Note also that the invariant $a_1^2 - a_0a_2$ is the discriminant of the quadratic form
while $I_2^3 - 27I_3^2$ is the discriminant of the quartic form. In a similar way, the cubic
form discriminant given by

$$a_0^2a_3^2 - 3a_1^2a_2^2 + 4a_1^3a_3 + 4a_0a_2^3 - 6a_0a_1a_2a_3$$

is an invariant of that form, of degree 4 and weight 6.


10 Bell (1937), chapter 21. 
11 For the English translation, see Hilbert (1993). 
In his 1854 paper,¹² “An Introductory Memoir upon Quantics,” Cayley showed that
all invariants of (31.12) were homogeneous polynomials of a given degree, say $\theta$, and
weight p, identical to the integer in equation (31.14), and that the relation of these
quantities was determined by

$$n\theta = 2p. \tag{31.16}$$


Indeed, this paper included a similar result for covariants. Cayley also found a com- 
putationally simpler way of generating invariants by means of differential operators. 
Interestingly, Cayley later noted that as early as the 1840s, he had observed that 


a 0 
+ 2 =0, 
(« ob a ac) =0 


and this then led him to consider such operators even in connection with his researches 
on hyperdeterminants. So in 1854, Cayley defined the two operators 


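$$\Omega = a_0\frac{\partial}{\partial a_1} + 2a_1\frac{\partial}{\partial a_2} + 3a_2\frac{\partial}{\partial a_3} + \cdots + na_{n-1}\frac{\partial}{\partial a_n}, \tag{31.17}$$

$$O = na_1\frac{\partial}{\partial a_0} + (n-1)a_2\frac{\partial}{\partial a_1} + (n-2)a_3\frac{\partial}{\partial a_2} + \cdots + a_n\frac{\partial}{\partial a_{n-1}}. \tag{31.18}$$

He showed that an invariant $I$ of (31.12) satisfied the equations

$$\Omega I = 0 \quad \text{and} \quad OI = 0. \tag{31.19}$$

In fact, a seminvariant is defined as a homogeneous and isobaric (each term of the
same weight) polynomial $S$ satisfying $\Omega S = 0$.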
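The action of the two operators is easy to mechanize. In the Python sketch below, a polynomial in $a_0, \ldots, a_n$ is stored as a dictionary from exponent tuples to integer coefficients (a representation chosen only for this illustration, with function names of our own), and the operators of (31.17) and (31.18) are applied to the quartic invariant $I_2$ and to the seminvariant $a_0$.

```python
def differentiate(poly, i):
    """Partial derivative with respect to a_i of a polynomial {exponents: coeff}."""
    out = {}
    for exps, coeff in poly.items():
        if exps[i] > 0:
            new = list(exps)
            new[i] -= 1
            key = tuple(new)
            out[key] = out.get(key, 0) + coeff * exps[i]
    return out


def times_variable(poly, i, scalar):
    """Multiply a polynomial by scalar * a_i."""
    out = {}
    for exps, coeff in poly.items():
        new = list(exps)
        new[i] += 1
        out[tuple(new)] = scalar * coeff
    return out


def add(u, v):
    """Sum of two polynomials, with zero terms removed."""
    out = dict(u)
    for key, coeff in v.items():
        out[key] = out.get(key, 0) + coeff
    return {key: coeff for key, coeff in out.items() if coeff != 0}


def omega(poly, n):
    """Omega = sum over k = 1..n of k * a_(k-1) * d/da_k, as in (31.17)."""
    result = {}
    for k in range(1, n + 1):
        result = add(result, times_variable(differentiate(poly, k), k - 1, k))
    return result


def big_o(poly, n):
    """O = sum over k = 0..n-1 of (n-k) * a_(k+1) * d/da_k, as in (31.18)."""
    result = {}
    for k in range(n):
        result = add(result, times_variable(differentiate(poly, k), k + 1, n - k))
    return result


# I_2 = a0*a4 - 4*a1*a3 + 3*a2^2 for the quartic (n = 4)
I2 = {(1, 0, 0, 0, 1): 1, (0, 1, 0, 1, 0): -4, (0, 0, 2, 0, 0): 3}
# a0 alone is a seminvariant of degree 1 and weight 0
A0 = {(1, 0, 0, 0, 0): 1}

if __name__ == "__main__":
    print(omega(I2, 4), big_o(I2, 4))   # both empty: I_2 satisfies (31.19)
    print(omega(A0, 4), big_o(A0, 4))   # Omega kills a0, but O does not
```

The second example shows why the condition $n\theta = 2p$ matters: $a_0$ is annihilated by $\Omega$, but with $\theta = 1$ and $p = 0$ the condition fails for $n = 4$, and indeed $O a_0 = 4a_1 \ne 0$.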

Sylvester also conceived of the idea of the differential operator and published it
before Cayley in an 1852 paper, “On the Principles of the Calculus of Forms.”¹³ In
this paper, Sylvester noted that he had discovered a simple derivation of (31.19) after
Cayley had communicated that result to him. Sylvester also remarked that the German
mathematician Siegfried Aronhold, “as I collect from private information, was the first
to think of the application of this method to the subject.”¹⁴

Cayley and Sylvester each proved that every seminvariant $I(a_0, a_1, \ldots, a_n)$ of
degree $\theta$ and weight p would be an invariant under the condition $n\theta = 2p$. They
generalized this to covariants and it became their favorite method of producing
invariants and covariants, of various degrees and weights. Cayley’s “Second Memoir upon
Quantics”¹⁵ gave a combinatorial method for computing the number of invariants of
degree $\theta$ and weight p by solving the equation

$$\Omega S(a_0, a_1, \ldots, a_n) = \Omega \sum C_{k_0k_1\cdots k_n}\,a_0^{k_0}a_1^{k_1}\cdots a_n^{k_n} = 0. \tag{31.20}$$

12 Cayley (1854).
13 Sylvester (1852).
14 Sylvester (1973) vol. 1, pp. 351-352.
15 Cayley (1855).
Now the number of terms of the form $a_0^{k_0}a_1^{k_1}\cdots a_n^{k_n}$ that are homogeneous of degree
$\theta$ and weight p is equal to the number of nonnegative integer solutions of the two
equations:

$$k_0 + k_1 + \cdots + k_n = \theta, \tag{31.21}$$

$$k_1 + 2k_2 + \cdots + nk_n = p. \tag{31.22}$$

Let $\omega_n(\theta, p)$ denote this number. When the $\Omega$ operator is applied, we get terms of
degree $\theta$ and weight $p - 1$. So equation (31.20) consists of $\omega_n(\theta, p - 1)$ equations
in $\omega_n(\theta, p)$ variables. Cayley conjectured that the equations were independent and
proceeded on this certainty. With this assumption, he was able to prove that the number
of invariants of degree $\theta$ and weight p for a form $f(x_1, x_2)$ of degree $n = \frac{2p}{\theta}$ would
be given by $\omega_n(\theta, p) - \omega_n(\theta, p - 1)$.

Observe that the number of solutions of equation (31.22) is the number of partitions
of p, where each part is at most n. This connection between the number of invariants
and the number of partitions probably led Cayley and Sylvester in the mid-1850s
to investigate partitions; Sylvester gave a course of lectures on the subject in 1857.
Interestingly, in 1878 during his later career at Johns Hopkins, Sylvester was able to
prove Cayley’s conjecture of independence,¹⁶ while he was again working intensely
on partitions with his students. This proof implied that $\omega_n(\theta, p)$ was the coefficient of
$x^p$ in the Gaussian polynomial

$$\frac{(1 - x^{n+1})(1 - x^{n+2})\cdots(1 - x^{n+\theta})}{(1 - x)(1 - x^2)\cdots(1 - x^{\theta})}. \tag{31.23}$$

We observe parenthetically that this in turn implies that the Gaussian polynomial is
unimodal; recall that a polynomial $a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$ is called unimodal
if there exists an integer $m \le n$ such that

$$a_0 \le a_1 \le a_2 \le \cdots \le a_m \ge a_{m+1} \ge a_{m+2} \ge \cdots \ge a_n. \tag{31.24}$$
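The counting argument can be tried out on small cases. The Python sketch below computes $\omega_n(\theta, p)$ as the number of partitions of p into at most $\theta$ parts, each at most n (this is the solution count of (31.21) and (31.22), with $k_i$ the number of parts equal to i), and then forms Cayley’s difference $\omega_n(\theta, p) - \omega_n(\theta, p - 1)$. The function names are ours, and the recurrence is a standard one rather than Cayley’s own procedure.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def omega_count(n, theta, p):
    """Number of partitions of p into at most theta parts, each part at most n."""
    if p == 0:
        return 1
    if p < 0 or n == 0 or theta == 0:
        return 0
    # Either no part equals n, or one part equal to n is removed.
    return omega_count(n - 1, theta, p) + omega_count(n, theta - 1, p - n)


def invariant_count(n, theta):
    """Cayley's count of invariants of degree theta for a binary form of degree n,
    taken as zero when n * theta is odd (no integer weight p with n*theta = 2p)."""
    if (n * theta) % 2:
        return 0
    p = n * theta // 2
    return omega_count(n, theta, p) - omega_count(n, theta, p - 1)


def gaussian_coefficients(n, theta):
    """Coefficients of x^0, ..., x^(n*theta) in the Gaussian polynomial (31.23)."""
    return [omega_count(n, theta, p) for p in range(n * theta + 1)]


def is_unimodal(coeffs):
    """True if the sequence rises weakly to a peak and then falls weakly."""
    peak = coeffs.index(max(coeffs))
    rising = all(coeffs[i] <= coeffs[i + 1] for i in range(peak))
    falling = all(coeffs[i] >= coeffs[i + 1] for i in range(peak, len(coeffs) - 1))
    return rising and falling


if __name__ == "__main__":
    print(gaussian_coefficients(4, 2))
    print(invariant_count(2, 2), invariant_count(3, 4), invariant_count(4, 2), invariant_count(4, 3))
```

For the quartic, the counts for degrees 2 and 3 are both 1, in agreement with the invariants $I_2$ and $I_3$; for the cubic, degree 4 gives 1, matching the discriminant.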


In his second memoir, Cayley also considered the problem of determining the
fundamental invariants of a binary quantic. He presented the list of such invariants for
quadratic through the sextic forms, but due to an error in reasoning he believed that
binary forms of order seven and more did not have a finite basis, that is, that there did
not exist a finite number of invariants $I_1, I_2, \ldots, I_s$ of a form such that every invariant
of that form could be written as a polynomial in these s invariants. This mistake was
not corrected until the German mathematician Paul Gordan proved in 1868 that the
covariants, and hence also the invariants, of any binary quantic had a finite basis.¹⁷
This was later extended by Hilbert to covariants of m-ary quantics in a very important
paper published in 1890.¹⁸ In spite of repeated attempts, Sylvester and Cayley failed
to prove Gordan’s theorem by their own methods.


16 Sylvester (1878). 
17 Gordan (1868). 
18 Hilbert (1890). 
It is very interesting to recognize the origins of German invariant theory in number
theory and algebraic geometry. Since Fermat, binary quadratic forms had been studied
in number theory. In the 1770s, in order to study such forms, Lagrange applied linear
transformations such as in (31.5), except that he took the values p, q, r, and s to be
integers. In this context, Gauss mentioned equation (31.6) in article 158 of his 1801
Disquisitiones. He did not go beyond the observation that the determinant $\theta(R)$ of the
form R divided by the determinant $\theta(Q)$ of the form Q was a square, $(ps - qr)^2$. Then
in 1844, while studying number theoretic properties of the binary cubic, Eisenstein
found the invariant¹⁹

$$a_0^2a_3^2 - 3a_1^2a_2^2 + 4a_0a_2^3 + 4a_1^3a_3 - 6a_0a_1a_2a_3. \tag{31.25}$$


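In the same year, L. O. Hesse (1811-1874), a student of Jacobi, defined the
important covariant

$$\frac{\partial^2 f}{\partial x_1^2}\,\frac{\partial^2 f}{\partial x_2^2} - \left(\frac{\partial^2 f}{\partial x_1\,\partial x_2}\right)^2 \tag{31.26}$$

for any binary quantic. He introduced this covariant in order to study critical points of
curves. More generally, for any homogeneous polynomial f of degree m in n variables
$x_1, x_2, \ldots, x_n$, he defined the determinant

$$\det\left(\frac{\partial^2 f}{\partial x_i\,\partial x_j}\right). \tag{31.27}$$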

Also note that in 1841, Jacobi published an important paper on functional deter- 
minants, defining the Jacobian and drawing attention to this area of study. The 
determinant (31.27) is now called the Hessian, a name given by Sylvester; in 1849,
Hesse’s student, Siegfried Aronhold (1819-1894), who also studied with Jacobi 
and Dirichlet, initiated the symbolic algebraic approach for studying invariants and 
covariants that characterized German invariant theory until Hilbert took it in a different 
direction in the 1880s. Clebsch and Gordan made use of Aronhold’s approach; Paul 
Gordan (1837-1912), who wrote his thesis in Berlin under Kummer and had as his 
only doctoral student the great Emmy Noether, mastered the symbolic method and 
thereby proved that the invariants and covariants of a binary quantic had a finite basis. 

The problem of extending Gordan’s result to forms in n variables was very 
difficult to tackle using the existing algorithmic methods. In 1890, Hilbert introduced 
new methods and solved the problem. Hilbert’s proof depended on his lemma 
concerning solutions of a system of linear Diophantine equations. He proved the 
existence of a finite number of solutions of a special kind. This approach lent his 
theorem a nonconstructive character. Thus, Gordan was said by Max Noether to have 
commented, “This is not mathematics; this is theology!” Three years later, Hilbert 
gave a different proof, dependent on what is now known as the Hilbert basis theorem;
it has now been reformulated in terms of ideals: If $I \subset K[x_1, x_2, \ldots, x_n]$ is any ideal
in the ring of polynomials in n variables with coefficients in the field K, then there
exists a finite number of polynomials $f_1, f_2, \ldots, f_m$ in $I$ such that for all $f$ in $I$

$$f = A_1f_1 + A_2f_2 + \cdots + A_mf_m \tag{31.28}$$

for some polynomials $A_1, A_2, \ldots, A_m$ in the ring. With the development of
the machinery of Gröbner bases, Hilbert’s second method of proof has become
computationally quite significant.

19 Eisenstein (1975) vol. 1, pp. 1-3.

Hilbert’s work became the foundation for the development of commutative algebra 
in the twentieth century; it paved the way leading toward the abstract point of view 
in algebra and to the recent computational methods of ideal theory. In the area of 
commutative algebra, the chess champion Emanuel Lasker (1868-1941) in 1905 
established the main facts behind the primary decomposition of ideals. Another impor- 
tant contributor to the theory of rings of polynomials was F. S. Macaulay (1862-1937), 
Littlewood’s teacher of mathematics at St. Paul’s School in London. Although he was 
an excellent mathematical researcher, Macaulay remained a secondary school teacher 
throughout his career. In a famous 1921 paper, “Idealtheorie in Ringbereichen,” Emmy
Noether pioneered the abstract approach to ring theory. It is interesting that while 
Gordan had an algorithmic and concrete approach to mathematics, his student became 
one of the founding stars of the abstract and conceptual approach to mathematics. 

Although we do not go into detail on the topic, it may be worthwhile to comment 
briefly on the origins and development of elimination theory. Invariant theorists and 
mathematicians working with rings of polynomials in several variables found the 
method of elimination useful in various contexts. Boole made liberal application of 
elimination theory to produce invariants, and the topic led to the development of 
several aspects of algebra and algebraic geometry. Remarkably, in the twenty-first 
century, Eric Feron, an aerospace engineer, saw fit to translate Étienne Bézout’s 1779
book on elimination theory as applied to polynomials in several variables, Théorie
générale des équations algébriques. In 2006 Feron wrote in the translator’s foreword:²⁰


Translating Bézout’s research centerpiece became necessary to me after attending an illuminating 
presentation made by Pablo Parrilo at MIT sometime around 2002. His presentation was devoted 
to polynomially constrained polynomial optimization via sum-of-square arguments. It was 
illuminating because much of sum-of-square optimization methods rely on (i) using polynomial 
multipliers, and (ii) considering the various monomials appearing in the polynomial expressions 
as independent variables, resulting in interesting algorithmic simplifications. Such was also 
Bézout’s approach when dealing with systems of polynomial equations. I decided I needed to 
investigate the matter in more detail, by reading Bézout’s work and writing the present translation. 


Étienne Bézout (1730-1783) became interested in mathematics by studying Euler,
and he made many practical mathematical applications, including a six-volume 
course for the French artillery. His most original investigations involved the analysis 
of polynomial equations in many variables. Bézout’s theorem on the number of 
intersection points of two plane algebraic curves is a direct consequence of his 
researches. 


20 Bézout (2006). 
Even before Bézout, elimination was used in the seventeenth and eighteenth
centuries to derive the discriminant of a polynomial or the resultant of two polynomials.
In fact, the word resultant was used to signify the result obtained after elimination.
The resultant R(f,g) of two polynomials is a polynomial in the coefficients of f and
g, and assuming that the coefficients of the highest powers of f and g do not vanish,
then R(f,g) = 0 if and only if f and g have a common root. In 1665-66, Newton
calculated the resultant of two cubics by computing various symmetric functions of
the roots, including sums of powers. He wrote out all the thirty-four terms of the
resultant.²¹ In his published work on algebra, he eliminated the variable from the two
equations by a different method.

It is an interesting coincidence that at about this same time, Seki Takakazu (1642- 
1708) was also thinking about the problem of elimination. Seki was a pioneer in the 
development of algebra in Japan; he joined with his student, Takebe Katahiro, to lay 
the foundation of early Japanese mathematics, or Wasan. Around 1670, Seki presented 
a method of obtaining the resultant by using determinants.?* Given two polynomial 
equations of degree n, he first converted them into n equations, each of degree n — 1. 
He then applied a method that amounted to computing the n by n determinant so 
obtained. He explained the details of his method by taking small values of n, at least 
up ton = 4.73 And Zhu Shijie investigated resultants in the thirteenth century. As for 
determinants, Chinese mathematicians had earlier used them to solve simultaneous 
linear equations and the Japanese mathematicians of the seventeenth and eighteenth 
centuries were familiar with this aspect of Chinese algebra.²⁴ Seki’s method was
rediscovered by Bézout. Consider their method for eliminating the x term from two 
cubics: 


$$f = a_1x^3 + b_1x^2 + c_1x + d_1, \qquad g = a_2x^3 + b_2x^2 + c_2x + d_2.$$


The three quadratic polynomials obtained from $f$ and $g$ would be $a_2 f - a_1 g$,
$(a_2x + b_2) f - (a_1x + b_1) g$, and $(a_2x^2 + b_2x + c_2) f - (a_1x^2 + b_1x + c_1) g$.
The $3 \times 3$ determinant formed by the coefficients of the three quadratics would then be the resultant.
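The elimination just described can be tried out numerically. The following is a minimal Python sketch, not a reconstruction of Seki's or Bézout's own computations: the function names and the coefficient-list representation (highest degree first) are assumptions made for illustration. It forms the three quadratics and evaluates the $3 \times 3$ determinant of their coefficients.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_sub(p, q):
    """Subtract q from p; both lists are assumed to have the same length."""
    return [a - b for a, b in zip(p, q)]


def bezout_resultant_cubics(f, g):
    """Determinant of the three quadratics built from two cubics.

    f = [a1, b1, c1, d1] and g = [a2, b2, c2, d2], highest degree first.
    """
    a1, b1, c1, _ = f
    a2, b2, c2, _ = g
    # Multiplier pairs (for f, for g) taken from the text.
    multipliers = [([a2], [a1]), ([a2, b2], [a1, b1]), ([a2, b2, c2], [a1, b1, c1])]
    rows = []
    for mf, mg in multipliers:
        diff = poly_sub(poly_mul(mf, f), poly_mul(mg, g))
        # All terms of degree three and higher cancel, leaving a quadratic.
        assert all(c == 0 for c in diff[:-3])
        rows.append(diff[-3:])
    (p, q, r), (s, t, u), (v, w, x) = rows
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)
```

For instance, with $f = (x-1)(x-2)(x-3)$ and $g = (x-1)(x+1)(x+2)$, which share the root 1, calling `bezout_resultant_cubics([1, -6, 11, -6], [1, 2, -1, -2])` gives zero, because all three quadratics vanish at $x = 1$.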

Euler and Lagrange and others contributed to elimination theory in the eighteenth 
century; in the nineteenth century, Sylvester and Cayley were deeply interested in the 
topic, especially for its connection with invariant theory. The resultant of two binary 
quantics, for example, was their simultaneous invariant. In 1840, Sylvester published 
“A Method of Determining by Mere Inspection the Derivatives from Two Equations 
of Any Degree,” giving the modern expression of the resultant of two polynomials of 
degrees m and n, respectively, as an (m + n) by (m + n) determinant. He explained the
general rule and illustrated it by computing the 4 × 4 determinant obtained in the case
of two quadratics. Since the computation of determinants is generally tedious, it is
interesting to read the remark at the end of Sylvester’s paper:²⁵


21 Newton (1967-1981) vol. 1, p. 518. 

22 Mikami (1914, 1974). 

23 Seki (1974). 

24 Mikami (1974). 

25 Sylvester (1973) vol. 1, pp. 54-57, especially p. 57.
Through the well-known ingenuity and kindly proferred help of a distinguished friend, I trust to be
able to get a machine made for working Sturm’s theorem, and indeed all problems of derivation, 
after the method here expounded; on which subject I have a great deal more to say, than can be 
inferred from this or my preceding papers. 


The distinguished friend was surely Charles Babbage who at that time was developing 
his analytical engine to carry out repetitive numerical and algebraic calculations. 
Babbage was assisted in his endeavor by Ada Lovelace, the daughter of Lord Byron. 

Cayley published several papers on elimination theory, reworking and simplifying 
the methods of earlier writers but also making very original contributions. The 
comments of Gelfand, Kapranov, and Zelevinsky in this connection are worth noting. 
In their book,²⁶ they write that in a short paper of 1848, Cayley “outlined a general
method of writing down the resultant of several polynomials in several variables. We 
were very surprised to find that Cayley introduced in this note several fundamental 
concepts of homological algebra: complexes, exactness, Koszul complexes, and even 
the invariant now sometimes called the Whitehead torsion or Reidemeister-Franz 
torsion of an exact complex. The latter invariant is a natural generalization of the 
determinant of a square matrix (which itself was a recent discovery back in 1848), so 
we prefer to call it the determinant of a complex. Using this terminology, Cayley’s 
main result is that the resultant is the determinant of the Koszul complex.” 

Elimination theory suffered a decline as Emmy Noether’s abstract approach came 
to the forefront. Algebraic algorithms had to be reworked into this new context. Thus, 
in his 1946 book on the foundations of algebraic geometry,²⁷ André Weil constructed
an abstract device intended to finally make elimination theory superfluous. However, 
algebraic equations in many variables are also studied by engineers, for whom the 
abstract approach is not ideal. Moreover, Shreeram Abhyankar, protesting Weil’s 
attempt to eliminate elimination theory,²⁸ pointed out that some useful mathematical
information could be lost in a nonconstructive method. Weil might well have agreed, 
and this may be indicated by his exposition of Eisenstein and Kronecker’s work on the 
constructive development of elliptic functions. And so elimination theory continues to 
flourish. A renewed interest in finding efficient algorithms has produced new methods 
such as Gröbner bases.


31.2 Boole’s Derivation of an Invariant 


In his two-part paper published in 1841, “Exposition of a General Theory of Linear 
Transformations,” Boole argued that the concept of an invariant could be useful in 
algebra.²⁹ He gave a method for the derivation of an invariant of a general form,
of degree n and in m variables. Although the method was not of great use in the 
further development of invariant theory, it is interesting to observe Boole’s originality 
in arriving at this important concept. We follow Boole closely; he supposed $h_n$ and


26 Gelfand, Kapranov, and Zelevinsky (1993) p. 4. 
27 Weil (1946). 

28 Abhyankar (1976). 

29 Boole (1841). 
$H_n$ were $n$th degree homogeneous functions of $m$ variables $x_1, x_2, \ldots, x_m$ expressible
linearly in terms of $m$ variables $y_1, y_2, \ldots, y_m$. He also supposed that


$$\begin{aligned}
h_n(x_1, x_2, \ldots, x_m) &= h'_n(y_1, y_2, \ldots, y_m),\\
H_n(x_1, x_2, \ldots, x_m) &= H'_n(y_1, y_2, \ldots, y_m),
\end{aligned} \tag{31.29}$$


where $h'_n$ and $H'_n$ were also homogeneous functions of degree $n$. In addition, Boole
wrote these relations in the simple form 


q=r and Q=R, (31.30) 


respectively. He differentiated both sides of the second equation with respect to
$y_1, y_2, \ldots, y_m$, and by means of the chain rule he got


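$$\begin{aligned}
\frac{\partial Q}{\partial x_1}\frac{\partial x_1}{\partial y_1} + \frac{\partial Q}{\partial x_2}\frac{\partial x_2}{\partial y_1} + \cdots + \frac{\partial Q}{\partial x_m}\frac{\partial x_m}{\partial y_1} &= \frac{\partial R}{\partial y_1},\\
\frac{\partial Q}{\partial x_1}\frac{\partial x_1}{\partial y_2} + \frac{\partial Q}{\partial x_2}\frac{\partial x_2}{\partial y_2} + \cdots + \frac{\partial Q}{\partial x_m}\frac{\partial x_m}{\partial y_2} &= \frac{\partial R}{\partial y_2},\\
&\;\;\vdots\\
\frac{\partial Q}{\partial x_1}\frac{\partial x_1}{\partial y_m} + \frac{\partial Q}{\partial x_2}\frac{\partial x_2}{\partial y_m} + \cdots + \frac{\partial Q}{\partial x_m}\frac{\partial x_m}{\partial y_m} &= \frac{\partial R}{\partial y_m}.
\end{aligned} \tag{31.31}$$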


We note that Boole did not use the modern partial derivative notation; he wrote
$\frac{dQ}{dx_1}$ for $\frac{\partial Q}{\partial x_1}$ and similarly for the other derivatives. Boole then assumed the linear


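relationship
$$\begin{aligned}
x_1 &= \lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_my_m,\\
x_2 &= \mu_1y_1 + \mu_2y_2 + \cdots + \mu_my_m,\\
&\;\;\vdots\\
x_m &= \rho_1y_1 + \rho_2y_2 + \cdots + \rho_my_m,
\end{aligned} \tag{31.32}$$
so that he could replace $\frac{\partial x_1}{\partial y_1}, \frac{\partial x_1}{\partial y_2}, \ldots$ by $\lambda_1, \lambda_2, \ldots$. He argued that since the values
$\lambda_1, \lambda_2, \ldots, \mu_1, \mu_2, \ldots$ were finite, the equations
$$\frac{\partial Q}{\partial x_1} = 0, \quad \frac{\partial Q}{\partial x_2} = 0, \quad \ldots, \quad \frac{\partial Q}{\partial x_m} = 0 \tag{31.33}$$
implied that
$$\frac{\partial R}{\partial y_1} = 0, \quad \frac{\partial R}{\partial y_2} = 0, \quad \ldots, \quad \frac{\partial R}{\partial y_m} = 0. \tag{31.34}$$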


He observed that, since the determinant of the linear transformation (31.32) could 
be zero, (31.34) did not imply (31.33). 
Boole denoted by $\theta(Q)$ the expression obtained when the variables were eliminated
from the polynomials

$$\frac{\partial Q}{\partial x_1}, \quad \frac{\partial Q}{\partial x_2}, \quad \ldots, \quad \frac{\partial Q}{\partial x_m}.$$


To eliminate $x$ from two polynomials of degree $n$, he suggested the Euclidean
algorithm. Initially, he had $m$ polynomials in $m$ variables. He could eliminate, for
example, $x_1$ from the first two, and then from the second and third, and so on until
there were $m - 1$ polynomials in $m - 1$ variables. A repetition of this method produced
$m - 2$ polynomials in $m - 2$ variables. Ultimately, he had all the variables eliminated,
obtaining an expression $\theta(Q)$ containing only the constants. Thus, if $\frac{\partial Q}{\partial x_i} = 0$,
$i = 1, \ldots, m$, he had $\theta(Q) = 0$. Moreover, since (31.33) implied (31.34), he also
had $\theta(R) = 0$; and a similar relation of mutual dependence also existed between $\theta(q)$
and $\theta(r)$.

More generally, Boole combined the two relations in (31.30) into one relation of
the form $Q + hq = R + hr$. In this case, if $h$ was such that

$$\theta(Q + hq) = 0, \tag{31.35}$$
then an analogous relation
$$\theta(R + hr) = 0 \tag{31.36}$$

would also be satisfied. Next, Boole let $\nu$ be the number of terms in the homogeneous
polynomials $q$, $r$, $Q$, $R$ and denoted the coefficients in these polynomials by $a_1, a_2, \ldots, a_\nu$,
$b_1, b_2, \ldots, b_\nu$, $A_1, A_2, \ldots, A_\nu$, $B_1, B_2, \ldots, B_\nu$, respectively. Then $\theta$ would
be a polynomial $\phi$ in $\nu$ unknowns, and he could write $\theta(Q) = \phi(A_1, A_2, \ldots, A_\nu)$.
Now from (31.35) and (31.36), Boole reasoned that for any $h$ for which

$$\phi(A_1 + ha_1, A_2 + ha_2, \ldots, A_\nu + ha_\nu) = 0, \tag{31.37}$$
he must also have

$$\phi(B_1 + hb_1, B_2 + hb_2, \ldots, B_\nu + hb_\nu) = 0. \tag{31.38}$$


The expression on the left-hand side of (31.37) was a polynomial in $h$ where the
term independent of $h$ would be $\phi(A_1, A_2, \ldots, A_\nu) = \theta(Q)$, and the coefficient of
the highest power of $h$ would be $\phi(a_1, a_2, \ldots, a_\nu) = \theta(q)$. If the polynomial was
divided across by this coefficient, the resulting monic polynomial would be identical
with the monic polynomial obtained after the same procedure was applied to the left-hand
side of (31.38). Since the coefficients of the polynomials could also be seen as
Taylor coefficients, Boole could deduce that


$$\frac{\theta(Q)}{\theta(q)} = \frac{\theta(R)}{\theta(r)} \tag{31.39}$$
and
$$\frac{\left(a_1\frac{d}{dA_1} + a_2\frac{d}{dA_2} + \cdots + a_\nu\frac{d}{dA_\nu}\right)^{r}\theta(Q)}{\theta(q)} = \frac{\left(b_1\frac{d}{dB_1} + b_2\frac{d}{dB_2} + \cdots + b_\nu\frac{d}{dB_\nu}\right)^{r}\theta(R)}{\theta(r)}. \tag{31.40}$$
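As an example, Boole noted the simple case
$$\begin{aligned}
ax^2 + 2bxy + cy^2 &= a'x'^2 + 2b'x'y' + c'y'^2,\\
Ax^2 + 2Bxy + Cy^2 &= A'x'^2 + 2B'x'y' + C'y'^2.
\end{aligned}$$
The results corresponding to (31.39) and (31.40) were
$$\frac{AC - B^2}{ac - b^2} = \frac{A'C' - B'^2}{a'c' - b'^2}, \tag{31.41}$$
$$\frac{aC - 2bB + cA}{ac - b^2} = \frac{a'C' - 2b'B' + c'A'}{a'c' - b'^2}. \tag{31.42}$$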


From (31.39) and at the conclusion of the first part of his paper, Boole arrived at 
the result that gave rise to algebraic invariant theory: 


$$\frac{\theta(Q)}{\theta(R)} = \frac{\theta(q)}{\theta(r)} = E.$$


Boole maintained that E could not depend on the coefficients in Q and R (or 
q and r). Thus, it must depend only on the coefficients appearing in the linear 
transformation (31.32). Boole wrote that he had found E to be an appropriate power 
of the determinant of the linear transformation (31.32), illustrating this by means of 
the binary quadratic and the cubic. He then went on to state the theorem contained in 
equation (31.7); this in turn led to the definition of an invariant. In a paper of 1844,³⁰
Boole gave details of a proof and gave the value of $\gamma$ in (31.7) as $m(n - 1)^{m-1}$.

Near the end of the second part of his 1841 paper, Boole wrote that “Linear 
transformations have hitherto been chiefly applied to the purpose of taking away from 
a proposed homogeneous function, those terms which involve the products of the 
variables. ... [T]he transformations, besides being linear, are understood to represent 
a geometrical change of axes.” He went on to say that linear transformation could be 
applied to purely algebraic problems without geometric considerations. As an example 
he posed the problem: “To transform the function, $ax^3 + 3bx^2y + 3cxy^2 + dy^3$, to the
form $a'x'^3 + d'y'^3$, $a'$ and $d'$ being given, and the transformation unrestricted by any
other condition than that of linearity.” 


30 Boole (1844a). 
After solving this problem by means of his method, he applied it to the solution of
a cubic equation with the comment, “The doctrine of linear transformations may be 
elegantly applied to the solution of algebraic equations.” In this connection, Cayley, 
well aware that a general quintic could not be solved in radicals, corresponded with 
Boole concerning the solution of a quintic and found that invariants could shed some 
light on their solution. 

It is interesting that in 1930, as a schoolboy of 16, Mark Kac solved the
cubic³¹ by an independently discovered method similar to Boole’s solution. Kac
wrote the cubic as a difference of two cubes, each of which was a linear function of 
the variable: 


$$x^3 + px + q = A(x + m)^3 - B(x + n)^3.$$


By equating the coefficients of the powers of $x$, he found that $A = \frac{n}{n-m}$ and $B = \frac{m}{n-m}$ and that $m$
and $n$ were solutions of the quadratic equation

$$y^2 - \frac{3q}{p}\,y - \frac{p}{3} = 0;$$
from this, he was able to derive Cardano’s formula. Kac’s paper was published in 
a Polish mathematics journal for students and because of this achievement he went 
on to become a mathematician. Luckily, the journal’s editor had been unaware of 
Boole’s work. 
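The decomposition into a difference of two cubes can be turned into a short numerical routine. The Python sketch below is only an illustration of the idea, not Kac's or Boole's own procedure. The function name is hypothetical, and it assumes $p \ne 0$ and $m \ne n$. It uses the relations $m + n = 3q/p$ and $mn = -p/3$, which follow from comparing coefficients in $x^3 + px + q = A(x+m)^3 - B(x+n)^3$ with $A = n/(n-m)$ and $B = m/(n-m)$.

```python
import cmath


def solve_depressed_cubic(p, q):
    """Roots of x^3 + p*x + q = 0 via A(x+m)^3 - B(x+n)^3.

    Illustrative sketch; assumes p != 0 and that m and n are distinct.
    """
    # m and n are the roots of y^2 - (3q/p) y - p/3 = 0.
    s = 3 * q / p
    disc = cmath.sqrt(s * s + 4 * p / 3)
    m, n = (s + disc) / 2, (s - disc) / 2
    # A(x+m)^3 = B(x+n)^3 with B/A = m/n, so (x+m)/(x+n) is a cube root of m/n.
    c0 = (m / n) ** (1 / 3)
    omega = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity
    roots = []
    for j in range(3):
        c = c0 * omega ** j
        # Solve x + m = c (x + n) for x.
        roots.append((c * n - m) / (1 - c))
    return roots
```

Applied to $x^3 - 7x + 6 = (x-1)(x-2)(x+3)$, the routine passes through complex values of $m$ and $n$ before returning roots that are real up to rounding error.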


31.3 Differential Operators of Cayley and Sylvester 


By the early 1850s, Cayley and Sylvester had discovered several elementary properties 
of invariants of binary quantics. They knew, for example, that these invariants 
were homogeneous and isobaric polynomials satisfying certain partial differential 
equations. They found these equations independently, though Sylvester was the first 
to publish them in 1852,³² and they used them as important tools as their work in
invariant theory progressed. We present Sylvester’s derivation of the partial differential 
operators, with a slight change in notation, especially in our use of subscripts. 
Following Sylvester closely, suppose that 


$$\phi = a_0x_1^n + na_1x_1^{n-1}x_2 + \tfrac{1}{2}n(n-1)a_2x_1^{n-2}x_2^2 + \cdots + a_nx_2^n \tag{31.43}$$

is a binary quantic and that $I(a_0, a_1, \ldots, a_n)$ is an invariant of $\phi$. To derive the
differential equation, use the special linear transformation

$$x_1 = y_1 + ey_2, \qquad x_2 = y_2 \tag{31.44}$$


31 Kac (1987). 
32 Sylvester (1852). 
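to obtain the quantic

$$\begin{aligned}
&a_0(y_1 + ey_2)^n + \binom{n}{1}a_1(y_1 + ey_2)^{n-1}y_2 + \cdots + \binom{n}{k}a_k(y_1 + ey_2)^{n-k}y_2^k + \cdots + a_ny_2^n\\
&\quad= A_0y_1^n + \binom{n}{1}A_1y_1^{n-1}y_2 + \cdots + \binom{n}{k}A_ky_1^{n-k}y_2^k + \cdots + A_ny_2^n,
\end{aligned}$$

where $A_0 = a_0$, $A_1 = a_1 + ea_0$, $A_2 = a_2 + 2ea_1 + e^2a_0$. Note that in general,

$$A_k = a_k + \binom{k}{1}ea_{k-1} + \binom{k}{2}e^2a_{k-2} + \cdots + e^ka_0. \tag{31.45}$$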


Since the determinant of the linear transformation is 1, it follows from the definition 
(31.14) of an invariant that 


$$I(A_0, A_1, \ldots, A_n) = I(a_0, a_1, \ldots, a_n). \tag{31.46}$$


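Let $\Delta a_k = A_k - a_k$ and $\Delta I = I(A_0, A_1, \ldots, A_n) - I(a_0, a_1, \ldots, a_n)$. By Taylor’s
theorem in several variables

$$\begin{aligned}
0 = \Delta I &= \sum_{k \ge 1}\frac{1}{k!}\left(\Delta a_0\frac{\partial}{\partial a_0} + \Delta a_1\frac{\partial}{\partial a_1} + \cdots\right)^k I\\
&= \sum_{k \ge 1}\frac{1}{k!}\left(ea_0\frac{\partial}{\partial a_1} + (2ea_1 + e^2a_0)\frac{\partial}{\partial a_2} + \cdots\right)^k I.
\end{aligned}$$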


Since this is true for every value of e, the coefficient of every power of e must be 
zero. In particular, the coefficient of the first power of e gives 


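$$\Omega I = \left(a_0\frac{\partial}{\partial a_1} + 2a_1\frac{\partial}{\partial a_2} + 3a_2\frac{\partial}{\partial a_3} + \cdots + na_{n-1}\frac{\partial}{\partial a_n}\right)I = 0. \tag{31.47}$$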


As Sylvester pointed out, this differential equation could also be obtained by taking 
the derivative of (31.46) with respect to e and applying the chain rule: 


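$$\begin{aligned}
0 &= \frac{\partial I}{\partial A_0}\frac{dA_0}{de} + \frac{\partial I}{\partial A_1}\frac{dA_1}{de} + \cdots + \frac{\partial I}{\partial A_n}\frac{dA_n}{de}\\
&= A_0\frac{\partial I}{\partial A_1} + 2A_1\frac{\partial I}{\partial A_2} + \cdots + nA_{n-1}\frac{\partial I}{\partial A_n}.
\end{aligned}$$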
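The relations (31.45) and (31.46) lend themselves to an exact numerical check. The Python sketch below uses hypothetical function names and arbitrarily chosen test values. It expands the transformed quantic by the binomial theorem, compares the resulting coefficients with formula (31.45), and confirms on one example that a known invariant of the quartic, $a_0a_4 - 4a_1a_3 + 3a_2^2$ (which appears later as (31.70)), is unchanged by the transformation (31.44).

```python
from fractions import Fraction
from math import comb


def transformed_coefficients(a, e):
    """Coefficients A_0..A_n of the quantic after x1 = y1 + e*y2, x2 = y2."""
    n = len(a) - 1
    raw = [Fraction(0)] * (n + 1)  # raw[j] multiplies y1^(n-j) * y2^j
    for k in range(n + 1):
        # Expand binom(n,k) * a_k * (y1 + e*y2)^(n-k) * y2^k.
        for i in range(n - k + 1):
            raw[k + i] += comb(n, k) * a[k] * comb(n - k, i) * e ** i
    # Strip the binomial coefficient binom(n,j) to recover A_j.
    return [raw[j] / comb(n, j) for j in range(n + 1)]


def formula_31_45(a, e, k):
    """A_k = sum_j binom(k,j) e^j a_(k-j), as in (31.45)."""
    return sum(comb(k, j) * e ** j * a[k - j] for j in range(k + 1))


def quartic_I2(a):
    """The quartic invariant a0*a4 - 4*a1*a3 + 3*a2^2."""
    return a[0] * a[4] - 4 * a[1] * a[3] + 3 * a[2] ** 2


# Arbitrary rational test data (an assumption, chosen only for illustration).
a = [Fraction(v) for v in (2, -1, 3, 5, 7)]
e = Fraction(3, 4)
A = transformed_coefficients(a, e)
```

Exact rational arithmetic is used so that the comparison between `quartic_I2(A)` and `quartic_I2(a)` is an equality rather than an approximation.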


Similarly, apply the transformation 


$$x_1 = y_1, \qquad x_2 = ey_1 + y_2 \tag{31.48}$$
to get the other differential equation

$$OI = \left(na_1\frac{\partial}{\partial a_0} + (n-1)a_2\frac{\partial}{\partial a_1} + \cdots + a_n\frac{\partial}{\partial a_{n-1}}\right)I = 0. \tag{31.49}$$


The operators $\Omega$ and $O$ defined by (31.47) and (31.49) turned out to be quite
important in invariant theory; Cayley and Sylvester made considerable use of them 
in their researches. The corresponding operators for covariants, defined by (31.15), 
are given by 


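$$\left(\Omega - x_2\frac{\partial}{\partial x_1}\right)C = 0, \tag{31.50}$$

$$\left(O - x_1\frac{\partial}{\partial x_2}\right)C = 0. \tag{31.51}$$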


To prove that invariants are homogeneous and isobaric polynomials, take another 
special linear transformation 


$$x_1 = e_1y_1, \qquad x_2 = e_2y_2. \tag{31.52}$$


In this case, the quantic is transformed to 


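$$\begin{aligned}
&a_0e_1^ny_1^n + \binom{n}{1}a_1e_1^{n-1}e_2\,y_1^{n-1}y_2 + \cdots + \binom{n}{k}a_ke_1^{n-k}e_2^k\,y_1^{n-k}y_2^k + \cdots + a_ne_2^ny_2^n\\
&\quad= A_0y_1^n + \binom{n}{1}A_1y_1^{n-1}y_2 + \cdots + \binom{n}{k}A_ky_1^{n-k}y_2^k + \cdots + A_ny_2^n,
\end{aligned}$$
where $A_0 = a_0e_1^n$, $A_1 = a_1e_1^{n-1}e_2$, $\ldots$, $A_k = a_ke_1^{n-k}e_2^k$, $\ldots$.
Now suppose
$$I(a_0, a_1, \ldots, a_n) = \sum d_{s_0s_1\cdots s_n}\,a_0^{s_0}a_1^{s_1}\cdots a_n^{s_n}.$$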


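Since the determinant $\delta$ of the transformation (31.52) is $e_1e_2$, we have by definition
(31.14)

$$\begin{aligned}
I(A_0, A_1, \ldots, A_n) &= I(a_0e_1^n, a_1e_1^{n-1}e_2, \ldots, a_ne_2^n)\\
&= \sum d_{s_0s_1\cdots s_n}\,a_0^{s_0}e_1^{ns_0}\,a_1^{s_1}e_1^{(n-1)s_1}e_2^{s_1}\cdots a_n^{s_n}e_2^{ns_n}\\
&= \sum d_{s_0s_1\cdots s_n}\,a_0^{s_0}\cdots a_n^{s_n}\,e_1^{ns_0 + (n-1)s_1 + \cdots + (n-k)s_k + \cdots + s_{n-1}}\,e_2^{s_1 + 2s_2 + \cdots + ks_k + \cdots + ns_n}\\
&= \delta^p I(a_0, a_1, \ldots, a_n) = e_1^pe_2^p\sum d_{s_0s_1\cdots s_n}\,a_0^{s_0}a_1^{s_1}\cdots a_n^{s_n}.
\end{aligned}$$
Equating the coefficients of $a_0^{s_0}a_1^{s_1}\cdots a_n^{s_n}$ yields
$$ns_0 + (n-1)s_1 + \cdots + (n-k)s_k + \cdots + s_{n-1} = p, \tag{31.53}$$

$$s_1 + 2s_2 + \cdots + ks_k + \cdots + ns_n = p. \tag{31.54}$$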
By adding these equations, one obtains
$$n(s_0 + s_1 + \cdots + s_n) = 2p. \tag{31.55}$$

This in turn implies that an invariant $I$ is homogeneous of degree $\theta = s_0 + s_1 + \cdots + s_n$.
If we define the weight of $a_k$ to be $k$, then the weight of $a_0^{s_0}a_1^{s_1}\cdots a_n^{s_n}$
will be given by equation (31.54); this means that $I$ is isobaric of weight $p$.

In his 1855 memoir, using the same method,³³ Cayley proved a similar result for
covariants. Suppose the covariant is given by 


$$C(a_0, a_1, \ldots, a_n, x_1, x_2) = C_0x_1^m + \binom{m}{1}C_1x_1^{m-1}x_2 + \cdots + C_mx_2^m. \tag{31.56}$$


Following the procedure we have presented, one may conclude that the coefficients 
$C_0, C_1, \ldots, C_m$ are homogeneous in $a_0, a_1, \ldots, a_n$ and are of the same degree; call
the degree $\theta$. Each coefficient is isobaric, and the weight of the coefficient $C_i$ is
$p + i$, $i = 0, 1, \ldots, m$. The weight $p$ of the coefficient $C_0$ is called the weight
of the covariant and the integer $m$ in (31.56) is called the order of the covariant. The
argument yielding these results on covariants also shows that

$$m = n\theta - 2p. \tag{31.57}$$


Cayley also used the differential equations (31.50) and (31.51) satisfied by the 
covariant to derive important relations among coefficients of a covariant. For example, 
from the equation 


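$$OC = x_1\frac{\partial C}{\partial x_2},$$

where $C$ is given by (31.56) and $O$ is the differential operator (31.49), we have

$$\begin{aligned}
&OC_0\,x_1^m + \binom{m}{1}OC_1\,x_1^{m-1}x_2 + \cdots + \binom{m}{k}OC_k\,x_1^{m-k}x_2^k + \cdots + OC_m\,x_2^m\\
&\quad= \binom{m}{1}C_1x_1^m + 2\binom{m}{2}C_2x_1^{m-1}x_2 + \cdots + (k+1)\binom{m}{k+1}C_{k+1}x_1^{m-k}x_2^k + \cdots + mC_mx_1x_2^{m-1}.
\end{aligned}$$

Equating coefficients produces the relations
$$OC_k = (m - k)C_{k+1}, \qquad k = 0, 1, \ldots, m. \tag{31.58}$$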


The first $m$ of these equations imply the relations

$$C_k = \frac{1}{m(m-1)\cdots(m-k+1)}\,O^kC_0, \qquad k = 1, 2, \ldots, m. \tag{31.59}$$


33 Cayley (1855). 
Since all the coefficients $C_1, C_2, \ldots, C_m$ of the covariant $C$ can be obtained from
$C_0$, Sylvester called $C_0$ the source of the covariant.

By using the other differential equation for the covariant C, (31.50), Cayley derived 
the equations 


$$\Omega C_k = kC_{k-1}, \qquad k = 0, 1, \ldots, m. \tag{31.60}$$


In particular, the source satisfied the differential equation $\Omega C_0 = 0$. Cayley and
Sylvester named any homogeneous and isobaric function $P$ of $a_0, a_1, \ldots, a_n$ a
semi-invariant, or seminvariant, if it was annihilated by $\Omega$, that is, $\Omega P = 0$. Thus, the
source of a covariant turned out to be a seminvariant. Clearly, not all seminvariants
are invariants. But Cayley pointed out that if a seminvariant of degree $\theta$ and weight $p$
satisfied equation (31.55), that is, $n\theta = 2p$, then the seminvariant would also be an
invariant.


31.4 Cayley’s Generating Function for the Number of Invariants 


Cayley’s ambition was to develop an algorithm capable of producing all the invariants 
of a given binary form. In this pursuit, it was important for him to determine the 
number of seminvariants of given degree and weight. By 1856, he had discovered a 
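beautiful connection between this problem and Gaussian polynomials. Recall that if $I$
is any seminvariant of degree $\theta$ and weight $p$, and

$$I = \sum d_{s_0s_1\cdots s_n}\,a_0^{s_0}a_1^{s_1}\cdots a_n^{s_n},$$
then
$$s_0 + s_1 + \cdots + s_n = \theta; \qquad s_1 + 2s_2 + \cdots + ns_n = p.$$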


Next, let $N(n, \theta, p)$ denote the number of seminvariants with given $n$, $\theta$, and $p$, and
let $\omega_n(\theta, p)$ denote the number of integer solutions of the two previous equations for
$s_k$, with the constraint that $s_k \ge 0$. The differential operator $\Omega$, when applied to $I$,
keeps the degree of each term the same but reduces the weight by one. Note that

$$\Omega I = \sum d_{s_0s_1\cdots s_n}\,\Omega\,a_0^{s_0}a_1^{s_1}\cdots a_n^{s_n} = 0. \tag{31.61}$$


The number of terms in (31.61) is $\omega_n(\theta, p - 1)$, and the coefficient of each of these
terms is zero. This implies that there are $\omega_n(\theta, p - 1)$ homogeneous linear equations
for $\omega_n(\theta, p)$ quantities. In his second memoir, of 1855,³⁴ Cayley correctly assumed
that these equations were independent and concluded that

$$N(n, \theta, p) = \omega_n(\theta, p) - \omega_n(\theta, p - 1). \tag{31.62}$$


Cayley was unable to prove his assumption but was so certain of its correctness 
that he based his invariant theory upon it. Sylvester provided a proof in 1878. Cayley 


34 Cayley (1855).
argued that it was obvious that the number $\omega_n(\theta, p)$ would turn out to be the coefficient
of $x^pz^\theta$ in the series expansion of

$$\frac{1}{(1 - z)(1 - xz)(1 - x^2z)\cdots(1 - x^nz)}. \tag{31.63}$$


Indeed, this is not difficult to see if we expand by the geometric series: 


$$(1 + z + z^2 + \cdots)(1 + xz + x^2z^2 + \cdots)\cdots(1 + x^nz + x^{2n}z^2 + \cdots).$$


Clearly, the coefficient of $x^pz^\theta$ will be equal to the number of nonnegative integer
solutions of the two equations involving $s_0, s_1, \ldots, s_n$.
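The claim about the coefficient of $x^pz^\theta$ can be checked for small cases. In the Python sketch below, whose function names are chosen only for illustration, $\omega_n(\theta, p)$ is computed in two ways: by direct enumeration of the exponents $(s_0, \ldots, s_n)$, and by multiplying out the factors of (31.63) one at a time as truncated power series.

```python
from itertools import product


def omega_bruteforce(n, theta, p):
    """Count (s_0..s_n) >= 0 with sum s_k = theta and sum k*s_k = p."""
    count = 0
    for s in product(range(theta + 1), repeat=n + 1):
        if sum(s) == theta and sum(k * sk for k, sk in enumerate(s)) == p:
            count += 1
    return count


def omega_series(n, theta, p):
    """Coefficient of x^p z^theta in 1/((1-z)(1-xz)...(1-x^n z))."""
    # coeff[t][w] holds the coefficient of z^t x^w in the partial product.
    coeff = [[0] * (p + 1) for _ in range(theta + 1)]
    coeff[0][0] = 1
    for k in range(n + 1):
        # Multiply by 1/(1 - x^k z): new[t][w] = old[t][w] + new[t-1][w-k].
        for t in range(1, theta + 1):
            for w in range(k, p + 1):
                coeff[t][w] += coeff[t - 1][w - k]
    return coeff[theta][p]
```

For the quartic ($n = 4$) with $\theta = 2$ and $p = 4$, both functions return 3, matching the three monomials $a_0a_4$, $a_1a_3$, $a_2^2$ of degree 2 and weight 4.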

Summarizing Cayley’s work in obtaining the invariants for forms of low degree, we
observe that in 1855 he expanded the generating function (31.63) for $\omega_n(\theta, p)$:

$$\frac{1}{(1 - z)(1 - xz)\cdots(1 - x^nz)} = 1 + G_1(x)z + G_2(x)z^2 + G_3(x)z^3 + \cdots. \tag{31.64}$$


To see the connection with Gaussian polynomials, change $z$ to $xz$ to get

$$\frac{1}{(1 - xz)\cdots(1 - x^{n+1}z)} = 1 + G_1(x)xz + G_2(x)x^2z^2 + G_3(x)x^3z^3 + \cdots. \tag{31.65}$$


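The two equations (31.64) and (31.65) imply

$$\begin{aligned}
&(1 - z)(1 + G_1z + G_2z^2 + \cdots + G_mz^m + \cdots)\\
&\quad= (1 - x^{n+1}z)(1 + G_1xz + G_2x^2z^2 + \cdots + G_mx^mz^m + \cdots).
\end{aligned}$$

Now equate the coefficients of $z^m$ on both sides to get
$$G_m - G_{m-1} = G_mx^m - G_{m-1}x^{m+n}$$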

or

$$G_m(x) = \frac{1 - x^{m+n}}{1 - x^m}\,G_{m-1}(x) = \frac{(1 - x^{m+n})(1 - x^{m+n-1})\cdots(1 - x^{n+1})}{(1 - x^m)(1 - x^{m-1})\cdots(1 - x)}. \tag{31.66}$$


Thus, $G_m(x)$ is a Gaussian polynomial and the coefficient of $x^p$ in the polynomial
$G_\theta(x)$ gives $\omega_n(\theta, p)$. Now Cayley realized that the number of seminvariants
$N(n, \theta, p)$ could be expressed as the difference between the coefficients of $x^p$ and
$x^{p-1}$ in the Gaussian polynomial $G_\theta(x)$, and this difference gave the number of
invariants of degree $\theta$ and weight $p$ provided that $n\theta = 2p$. Note that $N(n, \theta, p)$
is the coefficient of $x^p$ in
$$\frac{(1 - x^{\theta+1})(1 - x^{\theta+2})\cdots(1 - x^{\theta+n})}{(1 - x^2)(1 - x^3)\cdots(1 - x^n)}. \tag{31.67}$$


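Also observe that for invariants with weight $p = \frac{n\theta}{2}$, $\omega_n(\theta, \frac{n\theta}{2})$ and $\omega_\theta(n, \frac{n\theta}{2})$ are equal
because they both turn out to be the coefficient of $x^{n\theta/2}$ in

$$\frac{(1 - x^{n+1})(1 - x^{n+2})\cdots(1 - x^{n+\theta})}{(1 - x)(1 - x^2)\cdots(1 - x^\theta)}.$$

This immediately implies that

$$N\!\left(n, \theta, \tfrac{n\theta}{2}\right) = N\!\left(\theta, n, \tfrac{n\theta}{2}\right), \tag{31.68}$$

a result, known as Hermite’s reciprocity theorem,³⁵ established by Hermite in 1852 by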
a different method. Sylvester noted that this theorem was equivalent to stating that the 
number of partitions of any number p into at most m parts, with each part at most n, 
equaled the number of partitions of p into at most n parts, with each part at most m. 
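The recurrence behind (31.66) and the symmetry behind Hermite's reciprocity can both be explored by machine. The following Python sketch uses hypothetical names and stores coefficients lowest degree first. It builds $G_m(x)$ from $(1 - x^j)\,G_j = (1 - x^{j+n})\,G_{j-1}$ by exact polynomial division, and reads off $N(n, \theta, p)$ as a difference of two coefficients.

```python
def gaussian_poly(n, m):
    """Coefficients (lowest degree first) of G_m(x) in (31.66)."""
    g = [1]  # G_0(x) = 1
    for j in range(1, m + 1):
        # Numerator: g(x) * (1 - x^(j+n)).
        num = g + [0] * (j + n)
        for i, c in enumerate(g):
            num[i + j + n] -= c
        # Exact division by (1 - x^j): q[i] = num[i] + q[i - j].
        q = []
        for i in range(len(num) - j):
            q.append(num[i] + (q[i - j] if i >= j else 0))
        g = q
    return g


def num_seminvariants(n, theta, p):
    """N(n, theta, p): coefficient of x^p minus coefficient of x^(p-1) in G_theta."""
    g = gaussian_poly(n, theta)

    def coef(i):
        return g[i] if 0 <= i < len(g) else 0

    return coef(p) - coef(p - 1)
```

Interchanging the two arguments of `gaussian_poly` leaves the coefficient list unchanged, which is the partition statement noted by Sylvester; for example, `gaussian_poly(4, 3)` and `gaussian_poly(3, 4)` agree term by term.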

Cayley proceeded by applying these results to determine the full invariant systems 
for forms of degree $n = 2, 3, 4, 5, 6$. For example, by (31.67), when $n = 4$, the number
of independent invariants of degree $\theta$ would be the coefficient of $x^{2\theta}$ in

$$\frac{(1 - x^{\theta+1})(1 - x^{\theta+2})(1 - x^{\theta+3})(1 - x^{\theta+4})}{(1 - x^2)(1 - x^3)(1 - x^4)}.$$


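Observe that in order to find the coefficient of $x^{2\theta}$, we must retain numerator terms
of degree $2\theta$ or less; this means that we should determine the coefficient of $x^{2\theta}$ in the
power series expansion of

$$\frac{1 - x^{\theta+1} - x^{\theta+2} - x^{\theta+3} - x^{\theta+4}}{(1 - x^2)(1 - x^3)(1 - x^4)} = \frac{1}{(1 - x^2)(1 - x^3)(1 - x^4)} - \frac{x^{\theta+1}}{(1 - x)(1 - x^2)(1 - x^3)}.$$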


This coefficient would be the same as the coefficient of $x^{2\theta}$ in

$$\frac{1}{(1 - x^2)(1 - x^3)(1 - x^4)}$$

minus the coefficient of $x^\theta$ in

$$\frac{x}{(1 - x)(1 - x^2)(1 - x^3)}$$

or minus the coefficient of $x^{2\theta}$ in

$$\frac{x^2}{(1 - x^2)(1 - x^4)(1 - x^6)}.$$


35. For Cayley’s remarks on Hermite’s reciprocity, see Cayley (1854) pp. 256-258. 
Thus, we need the coefficient of $x^{2\theta}$ in

$$\frac{1 + x^3 - x^2}{(1 - x^2)(1 - x^4)(1 - x^6)}.$$

We may drop the odd power term $x^3$; then, we would require the coefficient of
$x^{2\theta}$ in
$$\frac{1}{(1 - x^4)(1 - x^6)}$$

or of $x^\theta$ in

$$\frac{1}{(1 - x^2)(1 - x^3)} = (1 + x^2 + x^4 + \cdots)(1 + x^3 + x^6 + \cdots). \tag{31.69}$$


Equation (31.69) implies that the number of independent invariants of degree $\theta$ is
equal to the number of nonnegative integer solutions of $2m + 3n = \theta$. Clearly, in each case $\theta = 2$
or $\theta = 3$, there is exactly one invariant, called $I_2$ or $I_3$. For nonnegative integers $m_1$
and $n_1$, if $2m_1 + 3n_1 = \theta$, then $I_2^{m_1}I_3^{n_1}$ is an invariant of degree $\theta$. It is easy to see that
all linearly independent invariants of a given degree can be produced by this method.
Hence, $I_2$ and $I_3$ generate the full invariant system of a binary form, or quantic, of
order 4.
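The reduction leading to (31.69) can be confirmed for small degrees by expanding the relevant power series. The Python sketch below uses illustrative function names that are not taken from Cayley. It computes the coefficient of $x^{2\theta}$ in $1/((1-x^2)(1-x^3)(1-x^4))$, subtracts the coefficient of $x^\theta$ in $x/((1-x)(1-x^2)(1-x^3))$, and compares the result with the number of nonnegative solutions of $2m + 3n = \theta$.

```python
def series_inverse_product(exponents, order):
    """Coefficients up to x^order of 1 / prod(1 - x^e for e in exponents)."""
    c = [1] + [0] * order
    for e in exponents:
        # Multiplying by 1/(1 - x^e) accumulates c[i] += c[i - e].
        for i in range(e, order + 1):
            c[i] += c[i - e]
    return c


def quartic_invariant_count(theta):
    """Count of degree-theta invariants of the quartic, following the text's steps."""
    first = series_inverse_product([2, 3, 4], 2 * theta)[2 * theta]
    # The coefficient of x^theta in x/((1-x)(1-x^2)(1-x^3)) equals the
    # coefficient of x^(theta-1) in the series without the factor x.
    second = series_inverse_product([1, 2, 3], theta)[theta - 1] if theta >= 1 else 0
    return first - second


def solutions_2m_3n(theta):
    """Number of nonnegative integer pairs (m, n) with 2m + 3n = theta."""
    return sum(1 for m in range(theta // 2 + 1) if (theta - 2 * m) % 3 == 0)
```

For $\theta = 6$, both counts give 2, corresponding to the two products $I_2^3$ and $I_3^2$.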

Cayley also showed how the differential operators could be used to determine the
invariants $I_2$ and $I_3$. For instance, for $I_2$, since it is of degree 2, it must be of weight 4
by the relation $n\theta = 2p$. The binary form of degree 4 has coefficients $a_0, a_1, a_2, a_3, a_4$;
therefore, the weight 4 and degree 2 monomials are $a_0a_4$, $a_1a_3$, and $a_2^2$. To find an
invariant $I$ of degree 2 and weight 4, Cayley could set

$$I = Aa_0a_4 + Ba_1a_3 + Ca_2^2$$

and then determine $A$, $B$, $C$ by solving the differential equation $\Omega I = 0$, where $\Omega$
was defined by (31.47). One may easily check that

$$\Omega a_0a_4 = 4a_0a_3, \qquad \Omega a_1a_3 = a_0a_3 + 3a_1a_2, \qquad \Omega a_2^2 = 4a_1a_2.$$
Thus, from
$$\Omega I = (4A + B)a_0a_3 + (3B + 4C)a_1a_2 = 0,$$

Cayley had $B = -4A$ and $C = 3A$. Hence, there was only one independent invariant
in this case, given by

$$I_2 = a_0a_4 - 4a_1a_3 + 3a_2^2; \tag{31.70}$$

one may check that the equation $OI_2 = 0$, from (31.49), is also satisfied. A similar
calculation would determine the invariant of degree 3 and weight 6:

$$I_3 = a_0a_2a_4 - a_0a_3^2 - a_1^2a_4 + 2a_1a_2a_3 - a_2^3. \tag{31.71}$$
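The checks that $\Omega$ and $O$ annihilate $I_2$ and $I_3$ are mechanical and can be delegated to a few lines of code. In the Python sketch below, a polynomial in $a_0, \ldots, a_n$ is stored as a dictionary from exponent tuples to integer coefficients. This representation and the function names are assumptions made for illustration only. The operators follow (31.47) and (31.49).

```python
def apply_operator(poly, terms):
    """Apply sum(c * a_src * d/da_dst for (c, src, dst) in terms) to poly.

    poly maps exponent tuples (s_0, ..., s_n) to integer coefficients.
    """
    out = {}
    for exps, coeff in poly.items():
        for c, src, dst in terms:
            if exps[dst] == 0:
                continue  # d/da_dst kills a monomial that does not contain a_dst
            new = list(exps)
            factor = coeff * c * new[dst]
            new[dst] -= 1
            new[src] += 1
            key = tuple(new)
            out[key] = out.get(key, 0) + factor
    return {k: v for k, v in out.items() if v != 0}


def omega_terms(n):
    """Omega = sum_{k=1}^{n} k * a_(k-1) * d/da_k, as in (31.47)."""
    return [(k, k - 1, k) for k in range(1, n + 1)]


def o_terms(n):
    """O = sum_{k=0}^{n-1} (n-k) * a_(k+1) * d/da_k, as in (31.49)."""
    return [(n - k, k + 1, k) for k in range(n)]


# I_2 and I_3 from (31.70) and (31.71); exponent tuples are (s_0, ..., s_4).
I2 = {(1, 0, 0, 0, 1): 1, (0, 1, 0, 1, 0): -4, (0, 0, 2, 0, 0): 3}
I3 = {(1, 0, 1, 0, 1): 1, (1, 0, 0, 2, 0): -1, (0, 2, 0, 0, 1): -1,
      (0, 1, 1, 1, 0): 2, (0, 0, 3, 0, 0): -1}
```

An empty dictionary returned by `apply_operator(I2, omega_terms(4))` signals that every coefficient in $\Omega I_2$ has cancelled.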
Cayley’s result for $n = 5$ was that the number of independent invariants of degree
$\theta$ was the coefficient of $x^\theta$ in

$$\frac{1 - x^6 + x^{12}}{(1 - x^4)(1 - x^6)(1 - x^8)},$$

or the coefficient of $x^\theta$ in

$$\frac{1 - x^{36}}{(1 - x^4)(1 - x^8)(1 - x^{12})(1 - x^{18})}. \tag{31.72}$$


The result (31.72) allowed Cayley to conclude that there were no invariants of odd 
degree, but that there was one irreducible invariant of degree 4, one of degree 8, one of 
12, and one of 18. However, these were connected by an equation of degree 36, that is, 
the square of the invariant of degree 18 was a polynomial function of the other three. 
Sylvester called such a relation a syzygy. Cayley attributed this result to Hermite.³⁶
In fact, before studying Hermite’s work, Cayley had thought that the degrees of the 
invariants of order 5 binary forms had to be divisible by 4. 

In the case of $n = 7$, Cayley made a conceptual error. He stated that the number of
independent invariants of degree $\theta$ was equal to the coefficient of $x^\theta$ in


Page oy gy ee. 
d—x)0 — x (1 — x8) — x!2)’ 


where the numerator was equal to 
(eae at ae ia ae, 


and where the series of factors did not terminate. Hence, he mistakenly concluded that 
the invariants did not have a finite basis. Gordan proved this to be incorrect. 


31.5 Sylvester’s Fundamental Theorem of Invariant Theory 


The counting method for finding the fundamental invariants, and Cayley’s conjecture 
in particular, were called into question when Cayley’s mistake became evident. But in 
1878, Sylvester succeeded in proving this basic result:³⁷

$$N(n, \theta, p) = \omega_n(\theta, p) - \omega_n(\theta, p - 1).$$
In his spirited style, his paper began: 
I am about to demonstrate a theorem which has been waiting proof for the last quarter of a century 
and upwards. It is the more necessary that this should be done, because the theorem has been 


supposed to lead to false conclusions, and its correctness has consequently been impugned . . . 
but the theorem itself is perfectly true, as I shall show by an argument so irrefragable that it must 


36 Cayley (1854) p. 257. 
37 Sylvester (1878). 
be considered for ever hereafter safe from all doubt or cavil. It lies at the basis of the investigations
begun by Professor Cayley in his Second Memoir upon Quantics, which it has fallen to my lot, 
with no small labour and contention of mind, to lead to a happy issue, and thereby to advance 
the standards of the Science of Algebraical Forms to the most advanced point that has hitherto 
been reached. The stone that was rejected by the builders has become the chief corner-stone of 
the building. 


We follow Sylvester’s reasoning very closely, but present it in slightly streamlined
form. The proof depends on Sylvester’s lemma that for a seminvariant
$F(a_0, a_1, \ldots, a_n)$ of degree $\theta$ and weight $p$,

$$\eta = n\theta - 2p \ge 0. \tag{31.73}$$

To prove the lemma, begin with the observation that if $U$ is any homogeneous,
isobaric polynomial of degree $\theta$ and weight $p$, then

$$(\Omega O - O\Omega)U = (n\theta - 2p)U. \tag{31.74}$$


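The result (31.74) was well known when Sylvester wrote his paper, but he presented
an argument:

$$\begin{aligned}
\Omega O - O\Omega &= na_0\frac{\partial}{\partial a_0} + 2(n-1)a_1\frac{\partial}{\partial a_1} + 3(n-2)a_2\frac{\partial}{\partial a_2} + \cdots + na_{n-1}\frac{\partial}{\partial a_{n-1}}\\
&\quad - na_1\frac{\partial}{\partial a_1} - 2(n-1)a_2\frac{\partial}{\partial a_2} - \cdots - 2(n-1)a_{n-1}\frac{\partial}{\partial a_{n-1}} - na_n\frac{\partial}{\partial a_n}\\
&= na_0\frac{\partial}{\partial a_0} + (n-2)a_1\frac{\partial}{\partial a_1} + (n-4)a_2\frac{\partial}{\partial a_2} + \cdots - (n-2)a_{n-1}\frac{\partial}{\partial a_{n-1}} - na_n\frac{\partial}{\partial a_n}.
\end{aligned} \tag{31.75}$$
If $d_{s_0s_1\cdots s_n}\,a_0^{s_0}a_1^{s_1}\cdots a_n^{s_n}$ is any monomial in $U$, then (31.75) implies that
$$\begin{aligned}
(\Omega O - O\Omega)U &= n\left(a_0\frac{\partial}{\partial a_0} + a_1\frac{\partial}{\partial a_1} + \cdots + a_n\frac{\partial}{\partial a_n}\right)U - 2\left(a_1\frac{\partial}{\partial a_1} + 2a_2\frac{\partial}{\partial a_2} + \cdots + na_n\frac{\partial}{\partial a_n}\right)U\\
&= \sum d_{s_0s_1\cdots s_n}(n\theta - 2p)\,a_0^{s_0}a_1^{s_1}\cdots a_n^{s_n}\\
&= (n\theta - 2p)U = \eta U.
\end{aligned}$$
So $\Omega O - O\Omega = \eta$. Moreover,

$$\Omega O^2 - O^2\Omega = (\Omega O - O\Omega)O + O(\Omega O - O\Omega).$$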
Since the differential operator $O$ raises the weight by 1, we see that
$$(\Omega O - O\Omega)OU = (n\theta - 2(p + 1))OU = (\eta - 2)OU. \tag{31.76}$$
Hence,
$$\Omega O^2 - O^2\Omega = (\eta - 2)O + \eta O = 2(\eta - 1)O.$$
By induction one can show that
$$\Omega O^r - O^r\Omega = r(\eta - r + 1)O^{r-1}. \tag{31.77}$$
For a seminvariant $F$, $\Omega F = 0$; and so
$$\Omega O^rF = r(\eta - r + 1)O^{r-1}F. \tag{31.78}$$

To conclude the proof of Sylvester’s lemma, suppose that $\eta$ is negative. In that case,
$|r(\eta - r + 1)|$, for $r = 1, 2, 3, \ldots$ forms an increasing sequence of nonzero integers.
Now $O^kF = 0$ for some $k \le n\theta - p + 1$. To understand this statement, note that $F$,
$OF$, $O^2F, \ldots$ have weights $p$, $p + 1$, $p + 2, \ldots$, but also have the same degree $\theta$,
and the greatest possible weight of any homogeneous polynomial of degree $\theta$ is $n\theta$,
attained by $a_n^\theta$. Thus, $p + k \le n\theta + 1$. So let $r$ be the value of $k$ such that $O^rF = 0$.
By (31.77), this implies $O^{r-1}F = 0$. It then follows that $\Omega O^{r-1}F = 0$, and hence
$O^{r-2}F = 0$. A repeated application of this procedure gives $\eta F = 0$ or $F = 0$; hence,
$\eta$ cannot be negative, proving the lemma. Sylvester also showed by induction that

$$\begin{aligned}
\Omega^qO^qF &= \eta(2\eta - 2)(3\eta - 6)\cdots(q\eta - (q^2 - q))F\\
&= q!\,\eta(\eta - 1)(\eta - 2)\cdots(\eta - q + 1)F.
\end{aligned} \tag{31.79}$$


Now we are equipped to prove Cayley’s conjecture. Let $D_n(p, \theta)$ denote the number
of linearly independent seminvariants of degree $\theta$ and weight $p$ so that the conjecture
can be formulated as

$$D_n(p, \theta) = \omega_n(p, \theta) - \omega_n(p - 1, \theta) = \Delta_n(p, \theta). \tag{31.80}$$

Observe that this equality holds if the $\omega_n(\theta, p - 1)$ equations satisfied by $d_{s_0s_1\cdots s_n}$
in (31.61) are independent. In any case, we have $D_n(p, \theta) \ge \Delta_n(p, \theta)$. Note that
$D_n(0, \theta) = \omega_n(0, \theta)$ since both sides equal 1. It is also clear that

$$\begin{aligned}
D_n(p, \theta) + D_n(p - 1, \theta) + \cdots + D_n(0, \theta) &\ge \Delta_n(p, \theta) + \Delta_n(p - 1, \theta) + \cdots + \Delta_n(0, \theta)\\
&= \omega_n(p, \theta).
\end{aligned} \tag{31.81}$$


If equality holds in this situation, then we see that $D_n(w, \theta) = \Delta_n(w, \theta)$ for all
weights $w \le p$. Since, for given $n$ and $\theta$, the weight $w$ satisfies the inequality
$n\theta - 2w \ge 0$, we have $w \le \frac{n\theta}{2}$. So the largest value of the weight would be $\frac{n\theta}{2}$ when
$n\theta$ is even, and it would be $\frac{n\theta - 1}{2}$ when $n\theta$ is odd. Let $p$ stand for the maximum
weight. Also let $[p]$, $[p - 1]$, $[p - 2]$, etc., denote seminvariants of degree $\theta$ in variables
$a_0, a_1, \ldots, a_n$, and of weights $p$, $p - 1$, $p - 2$, etc., respectively. Then the number
of linearly independent $[p]$s would be given by $D_n(p, \theta)$, and the number of linearly
independent $[p - 1]$s would be $D_n(p - 1, \theta)$, and so on. So choose a set of $D_n(p, \theta)$
independent $[p]$s, $D_n(p - 1, \theta)$ independent $[p - 1]$s, etc. From this set construct
a new set $S$ in which all the forms have the same weight $p$. This can be done by
applying the operator $O^q$ to the $D_n(p - q, \theta)$ forms $[p - q]$, since the weights of the
forms $O^q[p - q]$ are all $p$.

To prove that this set $S$ of forms of weight $p$ is linearly independent, we first show
that any one set of $O^q[p-q]$ is independent; if not, then the members $O^q[p-q]$ of
the set are connected by a linear equation. Apply the operator $\Omega^q$ to this equation. By
(31.79), $\Omega^q O^q[p-q]$ is a nonzero constant multiple of $[p-q]$. But this contradicts
the independence of the $[p-q]$s. Thus, we have shown that the subset consisting of
$O^q[p-q]$ is independent. Now suppose that a linear relation holds among
any number of subsets of the form $O^q[p-q]$ for which $m$ is the largest value of
$q$. Operate on this linear equation by $\Omega^m$. For $q < m$, this operation will introduce
quantities of the form $\Omega^{m-q}[p-q]$, but these will in fact vanish because $[p-q]$ is
a seminvariant and is hence annihilated by $\Omega$. Thus, only forms of the type $[p-m]$
will remain after the application of $\Omega^m$. This again gives us a contradiction because
the seminvariants $[p-m]$ were chosen to be independent. We can therefore conclude
that the set $S$ is linearly independent. Therefore, the number of elements in $S$ cannot
exceed $\omega_n(p,\theta)$. By construction, the number of elements of $S$ is given by

$$D_n(p,\theta) + D_n(p-1,\theta) + \cdots + D_n(0,\theta).$$

Hence, this sum is less than or equal to $\omega_n(p,\theta)$. Therefore, by (31.81), equality holds
and we have proved Cayley’s conjecture.
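
The counting step in this argument can be checked numerically in small cases. The following Python sketch is only an illustration and is not part of Sylvester's proof. It assumes that $\omega_n(w,\theta)$ denotes the number of partitions of $w$ into at most $\theta$ parts, none exceeding $n$ (equivalently, the number of monomials of degree $\theta$ and weight $w$ in $a_0, \ldots, a_n$), and that Cayley's count is $D_n(w,\theta) = \omega_n(w,\theta) - \omega_n(w-1,\theta)$. Both definitions lie outside this passage, and the function names `omega` and `D` are our own.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def omega(w, theta, n):
    """Partitions of w into at most theta parts, each part at most n (assumed definition)."""
    if w == 0:
        return 1
    if w < 0 or theta == 0 or n == 0:
        return 0
    # Either no part equals n, or we remove one part equal to n.
    return omega(w, theta, n - 1) + omega(w - n, theta - 1, n)


def D(w, theta, n):
    """Cayley's conjectured count of independent seminvariants (assumed formula)."""
    return omega(w, theta, n) - omega(w - 1, theta, n)


# Quartic (n = 4), degree theta = 3: the maximum weight is p = n*theta/2 = 6.
n, theta, p = 4, 3, 6
size_of_S = sum(D(p - q, theta, n) for q in range(p + 1))
print(size_of_S, omega(p, theta, n))
```

Under these assumptions the sum over $q$ telescopes to $\omega_n(p,\theta)$, which is the equality asserted at the end of the proof. As a further check, `D(5, 5, 5)` evaluates to 2, consistent with Exercise 6 below when the order is 5.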

Sylvester’s comments on his proof suggest that he may have been a keen student 
of Kant and valued mathematics as a creative endeavor. He wrote that his proof was 
accomplished “by aid of a construction drawn from the resources of the Imaginative 
reason, and founded on the reciprocal properties that have just been exhibited by 
the famous $O$ and $\Omega$.” Later in the paper, he argued that proofs of this type showed
that mathematics belonged among the liberal arts. “Whether we look to the advances 
made in modern geometry, in modern integral calculus, or in modern algebra, in each 
of these a free handling of the material employed is now possible, and an almost 
unlimited scope left to the regulated play of the fancy.”³⁸


31.6 Hilbert’s Finite Basis Theorem 


David Hilbert (1862–1943) was one of the most influential mathematicians of his time.
He is famous for advocating an abstract, structural approach to mathematical problems,
though his work on invariant theory had its algorithmic aspect. Hilbert studied
at Königsberg and attended lectures given by the outstanding teacher and number
theorist Heinrich Weber (1842–1913). In 1882, Weber and Dedekind collaborated on
an important paper in algebraic geometry, in which they presented Riemann surface
theory from an algebraic perspective.³⁹ It is clear that this work influenced Hilbert’s
later approach to invariant theory. We note parenthetically that Weber wrote a
three-volume work on algebra, useful even today.

38 Sylvester (1878) p. 185.

Hilbert proved his basis theorem for the general situation, beginning with any
number of m-ary forms or quantics. In his 1890 paper,⁴⁰ Hilbert employed a theorem
of Max Noether, father of Emmy, to prove the basic lemma upon which he built
his theory: If $F_1, F_2, F_3, \ldots$ is an infinite sequence of forms, that is, homogeneous
polynomials in $n$ variables $x_1, x_2, \ldots, x_n$ with coefficients in a field, then there exists
an integer $m$ such that every form in the sequence can be expressed as

$$F = A_1 F_1 + A_2 F_2 + \cdots + A_m F_m, \tag{31.82}$$

where $A_1, A_2, \ldots, A_m$ are appropriate forms in the same $n$ variables.

Using this lemma, Hilbert demonstrated that from an arbitrary collection of forms 
in n variables one can always choose a finite number such that every form in the 
collection is a linear combination of the chosen forms, as in (31.82). Hilbert proved 
this by contradiction, assuming the result false. Let $F_1 \ne 0$ be a form in the collection
and let $F_2$ be a form in the collection, but not expressible as $A_1 F_1$. By our assumption,
$F_2$ exists. Now let $F_3$ be a form not expressible as $A_1 F_1 + A_2 F_2$. Again, $F_3$ exists by
supposition. In this way, we construct a sequence of forms $F_1, F_2, F_3, \ldots$ for which
no number $m$ exists to satisfy (31.82). This contradicts Hilbert’s lemma. We remark
that in more modern books, this theorem is formulated in terms of polynomial ideals. 

Hilbert’s basis theorem for invariants states that there exists a finite number of
invariants $I_1, \ldots, I_m$ of a binary quantic or form $Q$ such that any invariant of $Q$ is
some polynomial function of $I_1, \ldots, I_m$. To prove this using Hilbert’s reasoning, let
$S$ denote the set of all invariants of $Q$. Though our treatment of this theorem is for
only one form, note that Hilbert did not restrict himself to one form $Q$, but considered a finite
number of them. His conclusion on the finite basis for the simultaneous invariants is
a generalization of the result for one form. Now these invariants are homogeneous
and isobaric polynomials in the $n + 1$ variables $a_0, a_1, \ldots, a_n$, the coefficients of the
quantic. Hence there exist $m$ invariants $I_1, I_2, \ldots, I_m$ such that every invariant $I$ in $S$
can be written as

$$I = Q_1 I_1 + \cdots + Q_m I_m. \tag{31.83}$$


Now the forms $Q_1, Q_2, \ldots, Q_m$ can be chosen to be isobaric in $a_0, a_1, \ldots, a_n$,
but they need not be invariants. To get invariants from $Q_1, Q_2, \ldots, Q_m$, Hilbert
constructed an operator using $O$ and $\Omega$:

$$L = 1 - \frac{O\Omega}{1!\,2!} + \frac{O^2\Omega^2}{2!\,3!} - \frac{O^3\Omega^3}{3!\,4!} + \cdots. \tag{31.84}$$

39 For an English translation of this paper, see Dedekind and Weber (2012).
40 For an English translation of this paper, see Hilbert (1978) pp. 143–224.


This operator has the property that if $F$ is any homogeneous and isobaric polynomial
in $a_0, a_1, \ldots, a_n$, of degree $\theta_1$ and weight $p_1$, such that $n\theta_1 - 2p_1 = 0$, then $LF$ is
either zero or an invariant. We shall present the proof of this property of the operator $L$
after we have deduced Hilbert’s theorem from it. For this purpose, apply $L$ to (31.83)
to get

$$LI = I = (LQ_1)I_1 + \cdots + (LQ_m)I_m, \tag{31.85}$$


a result that follows from the easily proved facts that for any invariant $I$, $LI = I$ and
$L(QI) = (LQ)I$. We must now show that $LQ_i$ is either an invariant or zero. Since
$I, I_1, \ldots, I_m$ in (31.85) are invariants, they satisfy the degree and weight condition
$n\theta - 2p = 0$, though the $m + 1$ invariants may have differing weights and degrees.
Thus, the isobaric forms $Q_1, Q_2, \ldots, Q_m$ also satisfy the condition $n\theta - 2p = 0$.
Hence $LQ_1, LQ_2, \ldots, LQ_m$ must each be either zero or an invariant. Clearly, not all of
them can be zero, for then $I$ would be zero. Thus, the nonzero $LQ_1, LQ_2, \ldots,
LQ_m$ are members of the set $S$ of invariants, and can once again be expressed in terms
of $I_1, I_2, \ldots, I_m$. However, the $LQ_i$ terms are of lower degree than $I$ and the process
will therefore terminate and every invariant $I$ will be a polynomial in $I_1, I_2, \ldots, I_m$.

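Hilbert did not bother to write down a proof of the required property of $L$. In his
1895 book on the algebra of quantics, Edwin Elliott, Sylvester’s student at Oxford,
gave a simple proof using the Cayley–Sylvester relation (31.74). Let $G$ be a form in
$a_0, a_1, \ldots, a_n$ with $\eta = n\theta - 2p > 0$. The weights of $\Omega G$, $\Omega^2 G$, $\Omega^3 G, \ldots$ are $p-1$,
$p-2$, $p-3, \ldots$, respectively, and hence the quantities corresponding to $\eta$ become
$\eta+2$, $\eta+4$, $\eta+6, \ldots$, respectively. Thus, from (31.74) and (31.77), we have the
relations

$$\begin{aligned}
\Omega O G - O \Omega G &= \eta G, \\
\Omega O^2 \Omega G - O^2 \Omega^2 G &= 2(\eta+1)\, O \Omega G, \\
&\;\;\vdots \\
\Omega O^r \Omega^{r-1} G - O^r \Omega^r G &= r(\eta + r - 1)\, O^{r-1} \Omega^{r-1} G.
\end{aligned}$$

Multiply the first equation by $\frac{1}{\eta}$, the second by $-\frac{1}{2\eta(\eta+1)}, \ldots,$ the $r$th by

$$\frac{(-1)^{r-1}}{r!\,\eta(\eta+1)\cdots(\eta+r-1)},$$

and so on. Add the resulting equations to obtain

$$\Omega\left[\frac{1}{1\cdot\eta}\,O - \frac{1}{2!\,\eta(\eta+1)}\,O^2\Omega + \frac{1}{3!\,\eta(\eta+1)(\eta+2)}\,O^3\Omega^2 - \cdots\right]G = G. \tag{31.86}$$

This sum is finite since $\Omega^{p+1}G$ vanishes. Now replace $G$ by $\Omega F$ where $F$ is an
isobaric form in $a_0, a_1, \ldots, a_n$ of weight $p + 1$, and write (31.86) as

$$\Omega\left[1 - \frac{1}{1\cdot\eta}\,O\Omega + \frac{1}{1\cdot 2\cdot\eta(\eta+1)}\,O^2\Omega^2 - \frac{1}{3!\,\eta(\eta+1)(\eta+2)}\,O^3\Omega^3 + \cdots\right]F = 0. \tag{31.87}$$

Now substitute $p$ for $p + 1$, enabling us to write $\eta > -2$ and $\eta + 2 > 0$. So if $F$ is of
weight $p$, we replace $\eta$ by $\eta + 2$ in (31.87) to get

$$\Omega\left[1 - \frac{1}{1\cdot(\eta+2)}\,O\Omega + \frac{1}{1\cdot 2\cdot(\eta+2)(\eta+3)}\,O^2\Omega^2 - \cdots\right]F = 0.$$

Therefore, when $\eta = n\theta - 2p = 0$, we have $\Omega(LF) = 0$, and this means that $LF$
is either an invariant or is identically zero.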

We note that in his doctoral thesis of 1885, Hilbert introduced the operator L, and 
other similar operators. He explained that L served as a generalization of transvection, 
an older method of producing covariants. The subject of Hilbert’s dissertation was 
special binary forms determined by algebraic differential equations and he mainly 
applied them to spherical functions. He took up this topic at the suggestion of his 
advisor at Königsberg, Ferdinand Lindemann (1852–1939), who is known for proving
the transcendence of $\pi$.


31.7 Hilbert’s Nullstellensatz 


Hilbert’s aim in his 1893 paper on invariants⁴¹ was to subsume invariant theory under
the general theory of algebraic function fields. This led him to a deeper proof of 
the basis theorem and to the creation of important new ideas fundamental to the 
development of twentieth-century commutative algebra and algebraic geometry. This 
proof of the basis theorem satisfied Gordan’s requirement that it be algorithmic. We
briefly discuss one of Hilbert’s important results, now known as the Nullstellensatz. 

Hilbert proved that for any form or quantic, or system of forms, there existed a
finite number of invariants $I_1, I_2, \ldots, I_\kappa$ such that any other invariant $I$ satisfied an
algebraic equation

$$I^m + G_1 I^{m-1} + G_2 I^{m-2} + \cdots + G_m = 0, \tag{31.88}$$

where $G_1, G_2, \ldots, G_m$ were integral rational functions of $I_1, I_2, \ldots, I_\kappa$. By
homogeneity, the functions $G_1, G_2, \ldots, G_m$ could not have a constant term. With
this result in hand, Hilbert considered forms whose coefficients had numerical values
such that all the invariants $I_1, I_2, \ldots, I_\kappa$ became zero, meaning that the value of all
the invariants was zero, since by (31.88), $I^m = 0$, or $I = 0$. Hilbert called a form null
if all its invariants were zero.


41 For an English translation, see Hilbert (1978) pp. 225–301.

The converse of this theorem is of interest. Suppose $I_1, I_2, \ldots, I_\kappa$ are invariants
such that their vanishing implies the vanishing of all other invariants of that form or
quantic. Hilbert showed that under these conditions, any invariant $I$ of this quantic
satisfied an equation of the type (31.88). Hilbert based his proof of this converse on
the result now known as the Hilbert Nullstellensatz:

Suppose $f_1, f_2, \ldots, f_m$ are $m$ homogeneous polynomials in $x_1, x_2, \ldots, x_n$, and
suppose $F_1, F_2, F_3, \ldots$ are homogeneous polynomials in the same variables, such
that they vanish for any values of the variables for which $f_1, \ldots, f_m$ all vanish. Then
one can find an integer $r$ such that every product $\Pi^{(r)}$ of $r$ arbitrary functions from the
sequence $F_1, F_2, F_3, \ldots$ can be represented in the form

$$\Pi^{(r)} = a_1 f_1 + a_2 f_2 + \cdots + a_m f_m,$$

where $a_1, a_2, \ldots, a_m$ are appropriately chosen homogeneous polynomials in
$x_1, x_2, \ldots, x_n$.


31.8 Exercises 


(1) Suppose the binary cubic form is

$$q = ax^3 + 3bx^2y + 3cxy^2 + dy^3.$$

Show that

$$\theta(q) = (ad - bc)^2 - 4(b^2 - ac)(c^2 - bd).$$

See Boole (1841).

(2) Suppose the ternary quadratic form is

$$q = ax^2 + by^2 + cz^2 + 2dyz + 2exz + 2fxy.$$

Show that

$$\theta(q) = abc + 2def - (ad^2 + be^2 + cf^2).$$

See Boole (1841).
(3) Let $q = ax^4 + 4bx^3y + 6cx^2y^2 + 4dxy^3 + ey^4$. Show that

$$\begin{aligned}
\theta(q) = {} & a^3e^3 - 6ab^2d^2e - 12a^2bde^2 - 18a^2c^2e^2 - 27a^2d^4 - 27b^4e^2 \\
& + 36b^2c^2d^2 + 54a^2cd^2e + 54ab^2ce^2 - 54ac^3d^2 - 54b^2c^3e - 64b^3d^3 \\
& + 81ac^4e + 108abcd^3 + 108b^3cde - 180abc^2de.
\end{aligned}$$

See Boole (1844a).

(4) Prove Sylvester’s 1877 generalization of Taylor’s theorem: Suppose $f$ is a
function of $a, b, c, \ldots$ and $f_1$ is the same function of

$$a_1 = a, \quad b_1 = b + ah, \quad c_1 = c + 2bh + ah^2, \quad d_1 = d + 3ch + 3bh^2 + ah^3, \ldots,$$

and let $\Omega$ represent the operator

$$\Omega = a\frac{\partial}{\partial b} + 2b\frac{\partial}{\partial c} + 3c\frac{\partial}{\partial d} + \cdots.$$

Then

$$f_1 = f + \Omega f \cdot h + \Omega^2 f \cdot \frac{h^2}{1\cdot 2} + \Omega^3 f \cdot \frac{h^3}{1\cdot 2\cdot 3} + \cdots.$$

Thus, $f_1 = f$ if and only if $\Omega f = 0$. According to Sylvester, this last statement
makes the theorem important in the calculus of invariants. See Sylvester (1973)
vol. 3, pp. 88–92.

(5) Find the independent invariants of degrees 4 and 8 for a binary form of order 5.
See Cayley (1889–1898) vol. 2, pp. 250–275.

(6) Show that a binary quantic has exactly two linearly independent seminvariants
of degree 5 and weight 5. See Elliott (1964) p. 132.

(7) Show that a binary form of order $4n + 2$ has a covariant of the second order and
third degree. See Elliott (1964) p. 157. Elliott attributes this result to Hermite.


31.9 Notes on the Literature 


See Corry (2004) for the role of invariant theory in the development of the structural 
method in algebra. He also elaborates on the influence of Dedekind on Hilbert 
and Emmy Noether. For Kac’s very early work on the cubic, see Kalman (2009). 
K. Parshall’s article in Rowe and McCleary (1989), pp. 157–206, gives a history of
nineteenth-century invariant theory before Hilbert. Crilly’s (2006) biography presents 
the development of Cayley’s mathematical thought with interesting details, especially 
in connection with invariant theory. The reader may also enjoy Hilbert’s (1993) 
lectures, given in 1897; the first sixty pages cover the work of Cayley and Sylvester. 
See also Elliott (1964); the first edition of 1895 presented a very readable exposition 
of nineteenth-century invariant theory in English, but it did not include the symbolic 
method of the German school. The 1903 book by Grace and Young (1965) filled 
this need. For recent works on invariant theory incorporating the classical methods 
of Cayley and Sylvester, see Olver (1999) and Sturmfels (2008). 


32


Summability 


32.1 Preliminary Remarks 


The subject of summability theory encompasses the variety of methods for averaging 
sequences, series, and integrals; it also includes the relationships among the various 
methods. This topic originated in the attempts to assign a value to the sum of a 
divergent series. Guido Grandi (1671–1742) made one of the earliest attempts, giving
the sum of the series $1 - 1 + 1 - 1 + \cdots$ to be $\frac{1}{2}$ by setting $x = 1$ in the formula

$$\frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots.$$


In a letter to Christian Wolf, published in 1713,¹ Leibniz reasoned that since the
sum of the first $n$ terms of $1 - 1 + 1 - 1 + \cdots$ would be 0 or 1 depending on whether
$n$ was even or odd, the values 0 and 1 would occur with equal frequency, and hence $\frac{1}{2}$
was the most probable value of the sum. This method amounts to taking the limit of the
averages of the partial sums assigned to the series $1 - 1 + 1 - 1 + \cdots$ as the number of
terms gets larger and larger. Note also that $1 - x + x^2 - x^3 + \cdots$ may be seen as a type
of weighted average of $1 - 1 + 1 - 1 + \cdots$. Newton also dealt with divergent series,
although in unpublished work. A significant example is his transformation formula,
now named after Euler,

$$\sum_{n=0}^{\infty} (-1)^n A_n x^{n+1} = \sum_{n=0}^{\infty} (-1)^n y^{n+1} \Delta^n A_0,$$

where $y = \frac{x}{1+x}$ and $\Delta$ denotes the forward difference, $\Delta A_n = A_{n+1} - A_n$. Newton
discovered this transformation in 1684, but it unfortunately
remained unpublished for almost three centuries.² He used it to evaluate the alternating
series for $\ln(1 + x)$ and for $\arctan x$, taking the absolute value of $x$ to be greater than 1.
Newton explained that this transformation could be applied to convert an alternating
divergent series to a convergent one; then, the value of the divergent series would be
given by the corresponding value of the convergent series. From this we can see that
Newton’s ideas on divergent series were groundbreaking.

1 Leibniz (1713).
2 Newton (1967–1981) vol. 4, pp. 604–611.
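
To make the transformation concrete, the following Python sketch applies it to the divergent series $x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots$ at $x = 2$. It is a hedged illustration, not Newton's own computation. It assumes the form $\sum (-1)^n A_n x^{n+1} = \sum (-1)^n \Delta^n A_0 \, y^{n+1}$ with $y = x/(1+x)$ and the forward difference $\Delta A_n = A_{n+1} - A_n$; the function names are ours.

```python
from fractions import Fraction
from math import comb, log


def forward_differences(a):
    """Return [Δ^0 a_0, Δ^1 a_0, ...] with the convention Δa_n = a_{n+1} - a_n."""
    return [
        sum((-1) ** (n - k) * comb(n, k) * a[k] for k in range(n + 1))
        for n in range(len(a))
    ]


def euler_transform_sum(a, x):
    """Sum of (-1)^n * Δ^n a_0 * y^(n+1), y = x/(1+x), over the available terms."""
    y = Fraction(x) / (1 + Fraction(x))
    diffs = forward_differences(a)
    return float(sum((-1) ** n * d * y ** (n + 1) for n, d in enumerate(diffs)))


terms = 40
A = [Fraction(1, n + 1) for n in range(terms)]   # coefficients of the ln(1 + x) series
approx = euler_transform_sum(A, 2)               # the original series diverges at x = 2
print(approx, log(3))
```

Exact rational arithmetic is used for the differences because repeated differencing in floating point loses accuracy quickly. With these coefficients the transformed series is $\sum y^{n+1}/(n+1)$ with $y = 2/3$, which converges to $\ln 3$.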

Between 1720 and 1740, de Moivre, Stirling, Euler, and Maclaurin gained significant,
though partial, insights into divergent asymptotic series. Their method, based on
the Euler–Maclaurin summation formula, was to begin with a finite series and convert
it to an infinite asymptotic series, yielding an excellent numerical approximation of the
finite series. Interestingly, in the twentieth century, Ramanujan also used the
Euler–Maclaurin formula in an attempt to construct a theory for summing divergent series.³

Euler and Lagrange also made considerable use of divergent series in their work,
though Euler’s work was clearly more incisive. In 1749, Euler gave a brilliant
application of summability by defining

$$1^n - 2^n + 3^n - 4^n + \cdots = \lim_{x \to 1^-} \left(x - 2^n x^2 + 3^n x^3 - 4^n x^4 + \cdots\right), \tag{32.1}$$

an equation he used to discover the functional relation for the zeta function.⁴ Recall
that Euler’s initial motivation may have been to study the series on the left, hoping
it would illuminate the problem of summing the zeta value $\zeta(2n + 1)$ where $n$ was a
positive integer. By generalizing (32.1), we may say that Euler defined the sum of the
series $\sum_{n=0}^{\infty} a_n$ by the equation

$$\sum_{n=0}^{\infty} a_n = \lim_{x \to 1^-} \sum_{n=0}^{\infty} a_n x^n. \tag{32.2}$$
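
As a numerical illustration of this definition, the sketch below evaluates the power series at a value of $x$ close to 1. It assumes that the series in (32.1) is $1^n - 2^n + 3^n - \cdots$ and that truncating after 60000 terms at $x = 0.999$ is adequate; the function name `abel_partial` is our own and the computation is only a rough check, not Euler's method.

```python
def abel_partial(exponent, x, terms=60000):
    """Approximate sum_{k>=1} (-1)^(k-1) * k^exponent * x^k for 0 < x < 1."""
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * k ** exponent * x ** k
    return total


value_n1 = abel_partial(1, 0.999)   # closed form x/(1+x)^2 tends to 1/4
value_n2 = abel_partial(2, 0.999)   # closed form x(1-x)/(1+x)^3 tends to 0
print(value_n1, value_n2)
```

The closed forms quoted in the comments follow from differentiating the geometric series; the second one tends to 0 as $x \to 1^-$, in line with the vanishing of the Abel sum for even exponents.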


As discussed in Chapter 4, in 1826 Abel proved that if $\sum a_n$ was convergent, then
(32.2) would hold. For this reason we say that if the value of the limit in (32.2) is taken
to be $L$, then the series $\sum a_n$ is Abel-summable or A-summable to $L$. Expressed in
another way, the Abel mean of $\sum a_n$ is $L$. Although Euler defined this summability
method, it is named after Abel. As a matter of fact, when $n$ is a positive even integer,
the series on the left-hand side of (32.1) sums to 0. As we have
mentioned, Abel ironically called this situation “horrible” and in a letter to Holmboe,
quoted Horace: “Risum teneatis, amici.” [Restrain your laughter, friends.]⁵
Interestingly, in the 1820s, Poisson applied Abel summability to the convergence
of Fourier series.⁶ Recall that Fourier claimed in his famous 1807 memoir and other
works that an arbitrary function could be expanded as a Fourier series; though he
presented several ingenious arguments in favor of this proposition, he did not provide
a real proof. In a paper published in 1820, Poisson attempted to demonstrate that
the Fourier series of a continuous function converged to that function by showing that

$$\lim_{r \to 1^-} \left(\frac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n \cos n\theta + b_n \sin n\theta)\, r^n\right) = f(\theta), \tag{32.3}$$

where $a_n$ and $b_n$ were the Fourier coefficients of a continuous function $f(\theta)$. Poisson
showed that the expression within parentheses in (32.3) could be expressed as

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{(1 - r^2)\, f(\phi)}{1 - 2r\cos(\theta - \phi) + r^2}\, d\phi,$$

now called the Poisson integral. He then gave an argument that as $r$ approached 1, the
integral approached $f(\theta)$, but this argument was full of gaps. But even had Poisson’s
proof of (32.3) been complete, he undermined it from the beginning by falsely
assuming the converse of Abel’s theorem. Recall that Cauchy made a similar error
at around the same time. Tauber and Littlewood later established that the converse of
Abel’s theorem required a growth condition on the coefficients. Cauchy, Abel, and
others concluded that divergent series had no sum, effectively banishing this topic for
nearly fifty years. It was only after the theory of convergent series was established on a
sound footing, through the efforts of Gauss, Cauchy, Abel, Dirichlet, and Weierstrass,
that mathematicians could confidently address the summability of divergent series.

3 See Hardy (1949) pp. 346–347.
4 Eu. I-15 pp. 70–90. E 352.
5 See Ore (1974) p. 97 or Stubhaug (2000) pp. 343–344.

The German mathematician Ferdinand Georg Frobenius (1849–1917) initiated the
modern theory of summability by proving the first theorem establishing a relation
between two different methods of summation. In a short paper of 1880⁷ he showed
that if $s_n = \sum_{k=0}^{n} a_k$ and

$$\lim_{n \to \infty} \frac{s_0 + s_1 + \cdots + s_n}{n+1} = S, \tag{32.4}$$

then

$$\lim_{x \to 1^-} \sum_{n=0}^{\infty} a_n x^n = S.$$

This theorem explained why Grandi and Leibniz obtained the same value for the
sum of the series $1 - 1 + 1 - 1 + \cdots$. Frobenius was a student of Weierstrass, and he
initially worked in differential equations and their series solutions. He branched out
into number theory and algebra with particular emphasis on groups. In answering
a question of Dedekind on group determinants, Frobenius created and developed
the topic for which he is best known, group representation theory. Two years
after Frobenius’s important paper, Otto Hölder (1859–1937), who also studied with
Weierstrass, extended that work. He defined⁸

$$H_n^{(r+1)} = \frac{H_0^{(r)} + H_1^{(r)} + \cdots + H_n^{(r)}}{n+1}, \quad r = 0, 1, 2, \ldots, \tag{32.5}$$

where $H_n^{(0)} = s_n$. He pointed out that there were sequences $s_0, s_1, s_2, \ldots$ for which
the limit (32.4) did not exist but such that there was an integer $r$ for which
$\lim_{n \to \infty} H_n^{(r)}$ existed. Thus, such a series $\sum a_n$ is said to be $(H,r)$ summable.
Moreover, Hölder proved that if $\lim_{n \to \infty} H_n^{(r)} = S$, then $\lim_{x \to 1^-} \sum_{n=0}^{\infty} a_n x^n = S$.
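
The averaging procedures of Frobenius and Hölder are easy to try numerically. The Python sketch below is a minimal illustration under our own naming (`partial_sums`, `holder_mean`, `abel_mean` are not from the sources), using finite truncations of the series. It computes the iterated arithmetic means with $H_n^{(0)} = s_n$ and compares them with a truncated Abel mean.

```python
def partial_sums(a):
    """Return the list of partial sums s_0, s_1, ..., of the terms in a."""
    s, out = 0, []
    for term in a:
        s += term
        out.append(s)
    return out


def holder_mean(a, r):
    """Last available value of the r-times iterated arithmetic mean of the partial sums."""
    h = partial_sums(a)                     # H^(0)_n = s_n
    for _ in range(r):
        running, averaged = 0.0, []
        for n, value in enumerate(h):
            running += value
            averaged.append(running / (n + 1))
        h = averaged
    return h[-1]


def abel_mean(a, x):
    """Truncated power series sum of a_n * x^n for the terms provided."""
    return sum(term * x ** n for n, term in enumerate(a))


grandi = [(-1) ** n for n in range(2000)]                  # 1 - 1 + 1 - 1 + ...
alternating = [(-1) ** n * (n + 1) for n in range(4000)]   # 1 - 2 + 3 - 4 + ...

print(holder_mean(grandi, 1), abel_mean(grandi, 0.99))
print(holder_mean(alternating, 1), holder_mean(alternating, 2))
```

For Grandi's series the first mean and the Abel mean both approach $\frac{1}{2}$. For $1 - 2 + 3 - 4 + \cdots$ the first means oscillate between values near 0 and $\frac{1}{2}$, so a single value of `holder_mean(alternating, 1)` is not meaningful, while the second means settle near $\frac{1}{4}$.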

The Italian mathematician Ernesto Cesàro (1859–1906), in spite of financial and
other challenges, managed to learn mathematics from a number of good teachers,
obtain positions in Italian universities, and publish prolifically in differential geometry
and number theory. He studied under Eugène Catalan in Liège and spent a year
in Paris attending lectures by Hermite and Gaston Darboux. He had wide interests,
including mathematical physics. In 1890,⁹ Cesàro gave an important application of
the summability method (32.4), shedding light on a classical question on products of
infinite series: Suppose that $\sum a_n = A$ and $\sum b_n = B$, and let the Cauchy product of
these two series be $\sum c_n$, where $c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0$. When does the
Cauchy product converge? Cesàro proved that even when the product did not converge,
the arithmetic means of the partial sums of the product would converge to
$AB$. In other words, if $C_n = c_0 + c_1 + \cdots + c_n$, then

$$\frac{C_0 + C_1 + \cdots + C_n}{n+1} \to AB \quad \text{as } n \to \infty. \tag{32.6}$$

Note that Cesàro’s theorem generalized Abel’s theorem that if $\sum a_n = A$, $\sum b_n = B$,
and $\sum c_n = C$, then $AB = C$. In today’s terminology, we would say that a series
$\sum a_n$ is Cesàro summable or (C,1) summable to $S$ if (32.4) holds true. We may also
say that the Cauchy product of two series converging to $A$ and $B$ is Cesàro summable
to $AB$. Cesàro next extended his result to not necessarily convergent series: If

Cesàro also defined a more general form of convergence, starting with

$$\binom{k+n}{n} A_{n,k} = s_n + \binom{k}{1} s_{n-1} + \binom{k+1}{2} s_{n-2} + \cdots + \binom{k+n-1}{n} s_0. \tag{32.7}$$

He defined a series $\sum a_n$ as summable (today we say (C,k) summable) to $A$ if
there was a $k$ such that $\lim_{n \to \infty} A_{n,k} = A$. Note that it is not necessary for $k$ to be a
nonnegative integer. Since we can write

$$\binom{k+n}{n} = \frac{(k+1)(k+2)\cdots(k+n)}{n!},$$

we may take $k$ to be a real number $> -1$.

In 1900, the Hungarian mathematician Lipót Fejér (1880–1959) delivered a big
boost to the Cesàro summability method by proving that the Fourier series of any
continuous function was (C,1) summable to the function.¹⁰ It is interesting that
Fejér’s result arose out of an earlier attempt to solve the Dirichlet problem for the
unit circle: For a continuous function $f(\theta)$ on the unit circle, determine a harmonic
function $\Phi(x, y) = \Phi(r, \lambda)$ inside the unit disk such that $\Phi(r, \lambda)$ tends to $f(\theta)$ as $re^{i\lambda}$
approaches $e^{i\theta}$ from inside the unit disk. Note that $\Phi(x, y)$ would be harmonic if it
satisfied Laplace’s equation

$$\frac{\partial^2 \Phi}{\partial x^2} + \frac{\partial^2 \Phi}{\partial y^2} = 0. \tag{32.8}$$

In 1870, Carl Neumann (1832–1925), son of Franz Neumann and one of the
founders of the Mathematische Annalen, made an attempt at solving this problem
by means of the harmonic function determined by the Poisson integral $P(r,\theta)$. He
used Poisson’s result, that $P(r,\theta)$ tended to $f(\theta)$ as $r \to 1^-$. Recall, however,
that the proof given by Poisson was incomplete; this in turn undermined Neumann’s
proof. As a third year student at the Technical University in Hungary, Fejér spent
1899–1900 in Berlin, attending lectures by L. Fuchs, Schwarz, and Frobenius, all
students of Weierstrass. Fejér learned of Neumann’s attempt from Schwarz, who in
1871 had solved the Dirichlet problem by an alternative method. Examining the gap
in Neumann’s proof, Fejér proved the (C,1) summability of the Fourier series of
a continuous function. This result, combined with Frobenius’s theorem that (C,1)
summability implied Abel summability, mended Poisson’s proof. As a corollary, Fejér
obtained the theorem that a continuous function could be uniformly approximated by
trigonometric polynomials on a closed interval. Since the sine and cosine functions
could be approximated by their Taylor polynomials, he further deduced Weierstrass’s
theorem on the uniform approximation of continuous functions by polynomials.

Similar to his first mathematical efforts, a number of Fejér’s later papers presented 
elegant solutions to interesting but circumscribed problems, where both the problems 
and the solutions had significant implications in several areas. While a professor at 
Budapest, he had a broad influence on the development of mathematics in Hungary. 
His mathematical style, his outgoing personality, and his wide-ranging cultural 
interests attracted many good students, including Erdős, Pólya, Szegő, Turán, and von
Neumann. 

The Austrian mathematician Alfred Tauber (1866–1942) was a professor of
mathematics at Vienna and an accomplished actuary; he served as chief mathematician
for the Phönix Insurance Company. He and Georg Pick died in the Theresienstadt
concentration camp at about the same time. Tauber gave a new direction to summability
theory with a result on a converse of Abel’s theorem on series.¹¹ He proved that if
$\sum a_n$ was Abel-summable to $A$ and $na_n \to 0$, or $a_n = o\left(\frac{1}{n}\right)$, as $n \to \infty$, then
$\sum a_n = A$. Tauber also proved an Abel summable series $\sum a_n$ to be convergent if and
only if

$$\frac{a_1 + 2a_2 + \cdots + na_n}{n} \to 0 \quad \text{as } n \to \infty. \tag{32.9}$$


In a paper of 1907, the German analytic number theorist Edmund Landau (1877–1938)
extended Tauber’s theorem to series of the form $\sum_{n=1}^{\infty} a_n e^{-\lambda_n x}$, where $\lambda_1 <
\lambda_2 < \cdots$ and $\lambda_n \to \infty$ as $n \to \infty$. Note that this covers power series as well as
Dirichlet series.¹² Landau also proved an integral analog of Tauber’s theorem: If

$$J(x) = \int_1^{\infty} f(t)\, t^{-x}\, dt \to A \quad \text{as } x \to 0 \tag{32.10}$$

and

$$f(t) = o\left(\frac{1}{t \ln t}\right) \quad \text{as } t \to \infty, \tag{32.11}$$

then

$$J(0) = \int_1^{\infty} f(t)\, dt = A. \tag{32.12}$$

Recall that in 1749, Euler attempted to use Abel summability to prove the
functional equation for the zeta function. In 1906, Landau vindicated Euler’s efforts by
proving that the Abel sum of the series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^s}$ yielded the value $(1 - 2^{1-s})\zeta(s)$,
obtained by the analytic continuation of the zeta function.¹³ In this work, Landau
employed an 1898 result of the Finnish mathematician Hjalmar Mellin. Landau,
a student of Frobenius, also introduced the one-sided Tauberian condition on the
coefficients of series, especially applicable in number theory. In 1903, Landau derived
the prime number theorem¹⁴ without using Hadamard’s theory of entire functions of
finite order; in 1907, he obtained an important generalization of Picard’s theorem
on entire functions.¹⁵ Concerning Landau’s 1927 Vorlesungen über Zahlentheorie,
Hardy and Heilbronn wrote in an obituary notice for Landau, “This remarkable work
is complete in itself; he does not assume... even a little knowledge of number-theory
or algebra. It stretches from the very beginning to the limits of knowledge, in 1927, of
the ‘additive,’ ‘analytic,’ and ‘geometric’ theories.”¹⁶

The preliminary summability results of Frobenius, Cesàro, Fejér, Tauber,
and Landau laid the foundation for a cohesive theory of summability with wide
applicability. The British mathematicians G. H. Hardy (1877–1947) and J. E.
Littlewood (1885–1977) were the first to fully understand the potential and scope
of this mathematical theory. Hardy’s many mathematical contributions included the
circle method, discovered jointly with Ramanujan in their work on the asymptotic
theory of partitions; and the concept of maximal functions, developed in collaboration
with Littlewood. His influence was felt as much through his teaching as in his
research. He helped raise British standards of teaching in analysis by publishing
his 1908 A Course of Pure Mathematics, still in print today. In his preface to the
1937 edition of this book, Hardy remarked that if he were to rewrite the book,
“I should not write (to use Prof. Littlewood’s simile) like ‘a missionary talking to
cannibals,’ but with decent terseness and restraint.”¹⁷ Hardy enjoyed mathematical
collaboration, and his association with Littlewood was one of the most productive
in the history of mathematics. They published almost one hundred joint papers in
analysis and analytic number theory. According to Harald Bohr’s birthday lecture of
1953, reprinted by Bollobás in his foreword to Littlewood’s Miscellany, the
Hardy–Littlewood collaboration was based on four rules:¹⁸


(1) When one wrote to the other, it was completely indifferent whether what they wrote was 
right or wrong. 

(2) When one received a letter from the other, he was under no obligation whatsoever to read 
it, let alone answer it. 

(3) Although it did not really matter if they both simultaneously thought about the same detail, 
still, it was preferable that they should not do so. 

(4) It was quite indifferent if one of them had not contributed the least bit to the contents of a 
paper under their common name. 


Although both Hardy and Littlewood lived on the Trinity College grounds, within 
one or two hundred yards of one another, and ate their meals in the same dining 
hall, their rules suggest that most of their communications were via the written word. 
Littlewood, unlike Hardy, had an interest in applied mathematics. In collaboration 
with Mary Cartwright (1900-1998), he also made important contributions to nonlinear 
differential equations and topological dynamics. Concerning Littlewood, V. I. Arnold 
wrote, “In mathematics he was a direct successor of Newton and Poincaré, doing 
research even on artillery ballistics. I was surprised to discover his estimates of the 
time of preservation of an adiabatic invariant in a Hamiltonian system. It is even
more surprising that the ‘theory of chaos’ in dynamical systems, including ‘Smale’s
horseshoe,’ had been already developed and published by Littlewood.”¹⁹

In a 1909 paper, Hardy showed that if $\sum a_n$ was (C,1) summable to $S$ and
$a_n = O\left(\frac{1}{n}\right)$ then $\sum a_n$ converged to $S$. He noted that by combining this result with
Fejér’s (C,1) summability of the Fourier series of a continuous function $f(x)$, one
obtained Dirichlet’s theorem on Fourier series. Take $f(x)$ to be monotonic, and apply
the second mean value theorem
to see that the Fourier coefficients are $O\left(\frac{1}{n}\right)$. Hardy was not successful in his attempt
to prove the more general result that if $\sum a_n$ was Abel summable to $S$ and $a_n = O\left(\frac{1}{n}\right)$,
then $\sum a_n = S$. In fact, he thought the result could well be false. He suggested
the problem to his former student Littlewood, who succeeded in solving it in the
affirmative. In his “A Mathematical Education,” Littlewood gave an account of his
discovery of the proof.²⁰ Surprisingly, as he grappled with the problem, he forgot that
Hardy had already proved the Cesàro–Tauber theorem. In 1911, during his attempt to
reprove this, he discovered the derivatives theorem.²¹ Note that in intuitive terms, the
derivatives theorem states that the orders of magnitude of two derivatives of a function
restrict the order of magnitude of the intermediate derivatives. Hardy and Littlewood
made considerable use of this concept in their early work. But, as they mentioned
in a paper of 1914,²² Hadamard had already proved the derivatives theorem and had
published it in an 1897 paper on waves. Indeed, A. Kneser also independently obtained
the theorem in the same year. Littlewood stated his Abel–Tauber theorem in the general
form that included Dirichlet series:

17 Hardy (1937).
18 Littlewood (1986) pp. 10–11.


If $0 < \lambda_1 < \lambda_2 < \cdots < \lambda_n \to \infty$ as $n \to \infty$,

Observe that when $\lambda_n = n$, the sum in (32.13) reduces to a power series, whereas
when $\lambda_n = \ln n$, one gets a Dirichlet series. Littlewood’s condition $|a_n| = O\left(\frac{1}{n}\right)$ is
quite natural, since it is easy to see that under this condition, if $\sum a_n x^n$ oscillates
finitely as $x \to 1^-$, then so does the sequence $\sum_{k=1}^{n} a_k$ as $n \to \infty$. In fact, Littlewood
pointed out that the condition $|a_n| = o\left(\frac{1}{n}\right)$ implied the much stronger result: that the
limits of oscillation of $\sum a_n x^n$ as $x \to 1^-$ and of $\sum_{k=1}^{n} a_k$ as $n \to \infty$ were the same.
These results must have suggested to him that in order for Abel summability to imply
convergence, a weaker condition would suffice.


20 Littlewood (1986) pp. 80-93. 

In an interesting 1910 paper,²³ Landau proved Hardy’s (C, 1) summability theorem 
with a weaker one-sided Tauberian condition $na_n > -K$, where $K$ was a constant. 
He mentioned that one-sided Tauberian arguments had been used by Hadamard and 
Vallée-Poussin in their proofs of the prime number theorem (PNT). In a 1913 paper,²⁴ 
Hardy and Littlewood proved a one-sided extension of Littlewood’s theorem: If 
$a_n \ge 0$, $\alpha > 0$, and 


[o,e) 

Hardy and Littlewood soon saw that this theorem had important implications in prime 
number theory. They showed that²⁵ 

$$\lim_{\delta \to 0^+} \delta \sum_{n=1}^{\infty} \Lambda(n)\, e^{-n\delta} = 1, \tag{32.17}$$

and since $\Lambda(n) \ge 0$, the hypothesis of their theorem, given by (32.15), was true with 
$\alpha = 1$, $x = e^{-\delta}$, $a_n = \Lambda(n)$, and $A = 1$. Recall that when $n$ is a positive integer 
power of a prime $p$, $\Lambda(n) = \ln p$; otherwise, it is 0. 

Next, by (32.16) 

It was well known that (32.18) was equivalent to the PNT, and to prove (32.17) 
Hardy and Littlewood needed the fact that with $s = \sigma + it$, $\zeta(s)$ had no zeros on 
$\sigma = 1$ and satisfied a very mild growth condition for large $t$ and $1 < \sigma < 2$. This 
growth condition was so weak that they concluded that there should be a proof of 
the PNT requiring only $\zeta(1 + it) \ne 0$ for real $t$. In looking for such a proof, they 
investigated the Lambert summability method. Note that a series $\sum a_n$ is Lambert 
summable to $S$ if 


x” 

In a paper written in 1919, Hardy and Littlewood proved that Lambert summability 
implied Abel summability.²⁶ From this theorem they could easily derive a result 
equivalent to the PNT. Unfortunately, this did not give a new proof of the PNT because 
to prove their Lambert summability theorem, they had used the fact that, with $\mu$ the 
Möbius function, 


$$g(n) = \sum_{m=1}^{n} \frac{\mu(m)}{m} = o\!\left(\frac{1}{\ln n}\right). \tag{32.20}$$


Thus, they relied on a result a little deeper than the PNT, since the PNT is equivalent 
to $g(n) = o(1)$ as $n \to \infty$. Though they failed to offer another proof of the PNT, their 
work set the stage for Wiener. In 1928, Norbert Wiener (1894–1964) found a method 
to directly handle Lambert summability. Wiener received his doctoral degree from 
Harvard University at the age of 18 with a thesis in logic. He then spent a part of 
1913 at Cambridge University to study under Bertrand Russell, who advised him to 
study mathematics and physics, especially the papers of Einstein and Niels Bohr on 
relativity, Brownian motion and quantum theory. Wiener was greatly impressed and 
influenced by Hardy’s course on real and complex variables, and all of this bore fruit 
about a decade later. Wiener was a professor at M.I.T. from 1919 to his death in 1964. 
He interacted vigorously with his engineering colleagues. The electrical engineering 
department requested that he provide a rigorous basis for Heaviside’s operational 
methods. This work led Wiener to a very fruitful study of a generalized harmonic 
analysis. He encountered a technical problem in his harmonic analysis research: Show 
that for a class of nonnegative functions $f(t)$ 


ee [ Odean = a pout dt (32.21) 

At this point in his researches, in 1926, Wiener was visiting Göttingen, as was 
his friend, the English mathematician A. E. Ingham. Wiener learned from Ingham 
that his problem was Tauberian in nature and that Hardy and Littlewood had worked 
on similar problems. Wiener corresponded with Hardy on this question but finally 
decided to follow his own approach, using Fourier transforms. In his autobiography, 
I Am a Mathematician, Wiener wrote that he also consulted Toeplitz’s student R. 
Schmidt, who had published an important paper on Tauberian theory in 1925. Wiener 
had hoped to collaborate with Schmidt on this problem, for there was a connection in 
their approaches, but this collaboration did not work out. However, Schmidt suggested 
that, since his own method had failed for Lambert summability and the PNT, Wiener 
might test his own approach in those cases. Wiener was soon able to discover a 
comprehensive method, covering all known Tauberian results. 

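To get a sense of Wiener’s work, begin by writing the Abel sum of $\sum a_n$ in the 
form 

$$A = \lim_{x \to 1^-} (1 - x) \sum_{n=0}^{\infty} s_n x^n = \lim_{x \to \infty} \frac{1}{x} \sum_{n=0}^{\infty} s_n e^{-n/x}, \tag{32.22}$$

where $s_n = a_0 + a_1 + \cdots + a_n$ denotes the $n$th partial sum. 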




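With this, we have another form of the Hardy–Littlewood theorem: If 

$$\lim_{x \to \infty} \frac{1}{x} \sum_{n=0}^{\infty} s_n e^{-n/x} = A \quad\text{and}\quad s_n = O(1), \quad\text{then}\quad \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} s_k = A. \tag{32.23}$$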

Now we can write the integral analog: If $F(t)$ is bounded and 

Note that the first limit in (32.24) is a weighted average of the function $F(t)$ 
where the weight function is given by $e^{-t}$. More generally, let the weight function 
be expressed as $G(t)$ so that the integral takes the form 

$$\frac{1}{x} \int_0^{\infty} G\!\left(\frac{t}{x}\right) F(t)\, dt = \int_{-\infty}^{\infty} e^{u-y}\, G(e^{u-y})\, F(e^{u})\, du = \int_{-\infty}^{\infty} K_1(u - y)\, f(u)\, du, \tag{32.25}$$

after applying the change of variables $t = e^{u}$, $x = e^{y}$, $F(e^{u}) = f(u)$, 
$e^{u-y} G(e^{u-y}) = K_1(u - y)$. Wiener could then pose the very general question: Given a bounded 
function $f(u)$ and kernel $K_1$ integrable over $(-\infty, \infty)$, under what conditions does 
the equation 

$$\lim_{y \to \infty} \int_{-\infty}^{\infty} K_1(u - y)\, f(u)\, du = A \int_{-\infty}^{\infty} K_1(u)\, du \tag{32.26}$$

imply 

$$\lim_{y \to \infty} \int_{-\infty}^{\infty} K_2(u - y)\, f(u)\, du = A \int_{-\infty}^{\infty} K_2(u)\, du \tag{32.27}$$


for a different integrable kernel $K_2$? To determine a simple condition on $K_1$, Wiener 
assumed that $K_2$ was a convolution of $K_1$ with an integrable function $R$, that is 

$$K_2(y) = \int_{-\infty}^{\infty} K_1(y - u)\, R(u)\, du. \tag{32.28}$$

Now note that the Fourier transform converts a convolution of two functions to 
the ordinary product of the transforms of the two functions. So, where $\hat{K}$ denotes the 
Fourier transform of $K$, 

$$\hat{K}_2 = \hat{K}_1 \cdot \hat{R}. \tag{32.29}$$

The beauty of relation (32.29) is that it allows us to determine $\hat{R}$ at all points if for 
all $x$ 

$$\hat{K}_1(x) = \int_{-\infty}^{\infty} e^{-ixt} K_1(t)\, dt \ne 0. \tag{32.30}$$

This was Wiener’s now-famous condition, under which the existence of the first average 
would imply the existence of the second. In his 1932 paper “Tauberian Theorems,” 
Wiener stated two forms of this theorem.²⁷ Note that he wrote $L_p$ for $L^p$. The first 
version of Wiener’s Tauberian theorem: Let $f(x)$ be a bounded measurable function, 
defined over $(-\infty, \infty)$. Let $K_1(x)$ be a function in $L_1$, and let 

$$\int_{-\infty}^{\infty} K_1(x)\, e^{-iux}\, dx \ne 0 \tag{32.31}$$

for all real $u$. Let 

$$\lim_{x \to \infty} \int_{-\infty}^{\infty} f(\xi)\, K_1(\xi - x)\, d\xi = A \int_{-\infty}^{\infty} K_1(\xi)\, d\xi. \tag{32.32}$$

Then if $K_2(x)$ is any function in $L_1$, 

Conversely, let $K_1(\xi)$ be a function of $L_1$, and let $\int_{-\infty}^{\infty} K_1(\xi)\, d\xi \ne 0$. Let (32.32) 
imply (32.33) whenever $K_2(x)$ belongs to $L_1$ and $f(x)$ is bounded. Then (32.31) 
holds. 

In his initial 1928 form of the theorem,²⁸ Wiener required a growth condition $O(1/\xi^2)$ 
at $\pm\infty$ for the kernels $K_1(\xi)$ and $K_2(\xi)$. In the 1932 version, he refined his theory 
by means of his well-known theorem on absolutely convergent Fourier series: If a 
nonvanishing function $f$ has an absolutely convergent Fourier series, then $1/f$ has an 
absolutely convergent Fourier series. Although this was a difficult result, it emerged 
less than a decade later as a corollary of I. M. Gelfand’s work on commutative Banach 
algebras. Wiener stated a second general theorem, directly applicable to infinite series, 
involving Stieltjes integrals; he derived a form of the PNT from this result. Thus, 
Wiener got his second Tauberian theorem: Let $f(x)$ be a function of limited total 
variation over every finite range, and let 

and 

$$\lim_{x \to \infty} \int_{-\infty}^{\infty} K_1(\xi - x)\, df(\xi) = A \int_{-\infty}^{\infty} K_1(\xi)\, d\xi. \tag{32.37}$$

If $K_2(x)$ is a continuous function in $L_1$ satisfying the condition (32.35), then 

Note that Wiener also stated a converse of this theorem. Then in 1938, H. R. 
Pitt (1914–2005) formulated a simple theorem containing both Wiener theorems as 
corollaries. Pitt took undergraduate courses from Hardy and Littlewood at Cambridge 
in the 1930s. After graduation in 1936, he studied under Wiener at M.I.T. In his 1938 
paper “General Tauberian Theorems,”²⁹ Pitt proved: Suppose $K(x) \in L^1(-\infty, \infty)$ 
and its Fourier transform $\hat{K}(t)$ does not vanish for any real $t$. If $f(x)$ is bounded, 
slowly oscillating, that is 

$$f(y) - f(x) \to 0 \quad\text{when } y > x,\ x \to \infty,\ y - x \to 0, \tag{32.39}$$

and 

The Serbian mathematician Jovan Karamata (1902–1967) also made an important 
contribution to Tauberian theory. In 1930, he published a two-page proof of the 
Hardy–Littlewood theorem, that Abel summability with a one-sided condition implied Cesàro 
summability.³⁰ Karamata’s proof used only the Weierstrass approximation theorem to 
prove his main result that if $a_n \ge 0$ and $\sum a_n$ was Abel summable to $s$, then for every 
Riemann integrable function $g(x)$, 

$$\lim_{x \to 1^-} (1 - x) \sum_{n=0}^{\infty} a_n x^n g(x^n) = s \int_0^1 g(t)\, dt. \tag{32.40}$$

This elegant proof took researchers in Tauberian theory completely by surprise, 
since up to that time all the proofs of the Hardy–Littlewood theorem had required 
a fair amount of machinery. Karamata graduated from the University of Belgrade in 
1925, where he came under the influence of Mihailo Petrović (1868–1943) who had 
studied at the École Normale in Paris under Hermite, Poincaré, and Picard. Petrović 
brought to Serbia the spirit of scientific research he learned in France. By the time he 


29 Pitt (1938). 
30 Karamata (1930). 

met Karamata, he had ceased to do mathematical research but he advised Karamata 
to study the latest mathematical discoveries. Karamata regarded himself as self-taught 
and would say that his teacher in classical analysis was Pólya and Szegő’s Aufgaben 
und Lehrsätze aus der Analysis, published in 1925. In fact, the topic of Karamata’s 
doctoral thesis was the development of Weyl’s work on the uniform distribution of 
sequences $x_1, x_2, x_3, \ldots$ in the interval (0,1). We observe that Weyl’s theorems were 
given as a set of five problems in Pólya and Szegő’s book. The first of these problems 
was to show that a sequence $x_1, x_2, x_3, \ldots$ in (0,1) was uniformly distributed if and 
only if for every Riemann integrable function $f$ 

$$\lim_{n \to \infty} \frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n} = \int_0^1 f(x)\, dx. \tag{32.41}$$

One may compare this with Karamata’s theorem. Again, it is interesting to note 
that, following their section on uniform distribution, Pólya and Szegő’s book posed 
a problem requiring the use of Weyl’s formula as well as Frobenius’s theorem on 
summability. Karamata also introduced the important concept of a regularly varying 
function. 
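
Weyl’s criterion (32.41) lends itself to a quick numerical illustration. The sketch below is not from the original sources: it assumes the standard example of the fractional parts of $n\sqrt{2}$, a sequence known to be uniformly distributed in (0,1), and compares the average of $f(x) = x^2$ along the sequence with $\int_0^1 x^2\,dx = 1/3$. The function name `weyl_average` and the choice of $f$ are illustrative only.

```python
import math

def weyl_average(f, n, alpha=math.sqrt(2)):
    """Average of f over the fractional parts {k * alpha}, k = 1..n."""
    return sum(f((k * alpha) % 1.0) for k in range(1, n + 1)) / n

# f(x) = x^2 has integral 1/3 over (0, 1); the average should approach it.
approx = weyl_average(lambda x: x * x, 100_000)
print(approx, 1 / 3)
```

Replacing `alpha` by a rational number makes the sequence periodic, and the average then generally fails to approach the integral, which is the contrast Weyl’s theorem explains.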


32.2 Fejér: Summability of Fourier Series 


In 1900, L. Fejér made an application of (C,1) summability to Fourier series by 
proving that the Fourier series of $f$ was (C,1) summable to $\frac{1}{2}\bigl(f(x+0) + f(x-0)\bigr)$ at every 
point where $f(x \pm 0)$ existed.³¹ He assumed that $f$ was bounded and integrable on 
$[0, 2\pi]$. Recall that the Fourier coefficients are given by 

$$a_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \cos nt\, dt, \qquad b_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \sin nt\, dt$$

and that the $n$th partial sum of a Fourier series is given by 

$$s_n(x) = \frac{1}{2} a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) = \frac{1}{2\pi} \int_0^{2\pi} f(t)\, dt + \frac{1}{\pi} \sum_{k=1}^{n} \int_0^{2\pi} f(t) \cos k(t - x)\, dt.$$

Fejér began his proof with the observation that 

$$\frac{1}{2} + \cos\theta + \cdots + \cos (n-1)\theta = \frac{1}{2} \cdot \frac{\cos (n-1)\theta - \cos n\theta}{1 - \cos\theta};$$


31 Fejér republished his paper in the Math Annalen in 1904. See Fejér (1904). 
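
hence, 

$$\frac{1}{n} \sum_{k=1}^{n} \left( \frac{1}{2} + \cos\theta + \cdots + \cos (k-1)\theta \right) = \frac{1}{2n} \cdot \frac{1 - \cos n\theta}{1 - \cos\theta} = \frac{1}{2n} \left( \frac{\sin \frac{n\theta}{2}}{\sin \frac{\theta}{2}} \right)^{2}.$$

Thus, for the arithmetic mean of the partial sums, he had 

$$S_n(x) = \frac{s_0(x) + s_1(x) + \cdots + s_{n-1}(x)}{n} = \frac{1}{n\pi} \int_{-x/2}^{\pi - x/2} f(x + 2u) \left( \frac{\sin nu}{\sin u} \right)^{2} du.$$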


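Fejér immediately perceived that this integral was simpler than the one found by 
Dirichlet for the partial sum $s_n(x)$ because the kernel $\left(\frac{\sin nu}{\sin u}\right)^2$ was always nonnegative, 
unlike the corresponding kernel $\frac{\sin (2n+1)u}{\sin u}$ in Dirichlet’s integral. Fejér first considered 
the case where $f$ was continuous at $x$. He let $\epsilon > 0$, so that there existed a $\delta > 0$ 
such that 

$$|f(x + h) - f(x)| < \epsilon \quad\text{for } |h| < \delta.$$

We note that Fejér’s notation interchanged $\epsilon$ and $\delta$. He next wrote the integral for 
$S_n(x)$ in three parts: 

$$S_n(x) = \frac{1}{2n\pi} \int_0^{x-\delta} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)} f(t)\, dt + \frac{1}{2n\pi} \int_{x-\delta}^{x+\delta} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)} f(t)\, dt + \frac{1}{2n\pi} \int_{x+\delta}^{2\pi} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)} f(t)\, dt.$$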


He assumed $|f(t)| < M$ in $[0, 2\pi]$. Then the absolute values of the first and third 
integrals were bounded by $\frac{2M}{n(1 - \cos\delta)}$. For the second integral, the positivity of the term 
multiplying $f(t)$ implied that 

$$\int_{x-\delta}^{x+\delta} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)} f(t)\, dt = \bigl(f(x) + \eta\bigr) \int_{x-\delta}^{x+\delta} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)}\, dt,$$

where $|\eta| < \epsilon$. Fejér then noted that 

$$\frac{1}{2n\pi} \int_0^{2\pi} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)}\, dt = 1.$$

Thus, 

$$\frac{1}{2n\pi} \int_{x-\delta}^{x+\delta} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)}\, dt = 1 - \frac{1}{2n\pi} \int_0^{x-\delta} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)}\, dt - \frac{1}{2n\pi} \int_{x+\delta}^{2\pi} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)}\, dt.$$

He observed that each of the last two integrals was less than $\frac{2}{n(1 - \cos\delta)}$. With all this 
information, he could conclude that for $n$ large enough 

$$|S_n(x) - f(x)| < 2\epsilon.$$


This proved Fejér’s theorem for the case in which $f$ was continuous at $x$. Assuming 
only the existence of the limits $f(x - 0)$ and $f(x + 0)$, Fejér broke the integral for 
$S_n(x)$ into two parts: 

$$S_n(x) = \frac{1}{2n\pi} \int_0^{x} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)} f(t)\, dt + \frac{1}{2n\pi} \int_{x}^{2\pi} \frac{1 - \cos n(t - x)}{1 - \cos (t - x)} f(t)\, dt.$$

Then by a similar argument, as $n \to \infty$ the first part tended to $\frac{1}{2} f(x - 0)$ and the 
second part tended to $\frac{1}{2} f(x + 0)$. 


Fejér went on to observe that if $f(x)$ was everywhere continuous, then $S_n(x)$ 
converged uniformly to $f(x)$. He also noted the following immediate corollaries of 
his theorem: 


• If the Fourier series converges at a point of continuity of a function, then its sum 
is the value of the function at that point. 

• A continuous function on a closed interval is a uniform limit of a sequence of 
polynomials. This is Weierstrass’s approximation theorem. 

• Poisson’s integral yields a solution for Dirichlet’s problem for the circle. 


Hermann A. Schwarz was the first to prove the third result. He felt that a proof by 
Fourier series was probably not possible. As noted before, Fejér’s motivation in the 
discovery of his theorem was to provide a proof using Fourier series. 
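
The effect of the nonnegative kernel can be seen numerically. The following sketch is an illustration under stated assumptions, not part of Fejér’s paper: it takes the square wave equal to $+1$ on $(0, \pi)$ and $-1$ on $(\pi, 2\pi)$, whose Fourier series is $\frac{4}{\pi}\sum_{k\ \mathrm{odd}} \frac{\sin kx}{k}$, and compares the partial sums $s_n$ with the arithmetic means $S_n$. The function names are hypothetical.

```python
import math

def partial_sum(x, n):
    """s_n(x) for the square wave equal to +1 on (0, pi) and -1 on (pi, 2*pi).

    Only odd harmonics occur: f(x) ~ (4/pi) * sum over odd k of sin(k x) / k.
    """
    return (4 / math.pi) * sum(math.sin(k * x) / k for k in range(1, n + 1, 2))

def fejer_mean(x, n):
    """S_n(x) = (s_0(x) + ... + s_{n-1}(x)) / n.

    The k-th harmonic appears in s_k, ..., s_{n-1}, so it carries weight (n - k) / n.
    """
    return (4 / math.pi) * sum(
        (1 - k / n) * math.sin(k * x) / k for k in range(1, n, 2)
    )

grid = [i * math.pi / 500 for i in range(1, 500)]    # points inside (0, pi), where f = 1
max_partial = max(partial_sum(x, 49) for x in grid)  # partial sums overshoot the value 1
max_fejer = max(fejer_mean(x, 50) for x in grid)     # the means never exceed 1
mean_at_continuity = fejer_mean(math.pi / 2, 200)    # should be close to f(pi/2) = 1
mean_at_jump = fejer_mean(math.pi, 200)              # should be close to (1 + (-1)) / 2 = 0
print(max_partial, max_fejer, mean_at_continuity, mean_at_jump)
```

Because the kernel is nonnegative and integrates to 1, each $S_n(x)$ is a weighted average of values of $f$ and so cannot leave the range $[-1, 1]$; the partial sums carry no such guarantee and overshoot near the jump.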

Hardy recognized that his Tauberian theorem on (C,1) summability, combined 
with Fejér’s theorem, immediately yielded a result on Fourier series: If the Fourier 
coefficients of a continuous function $f$ are $a_n = O(1/n)$ and $b_n = O(1/n)$, the Fourier 
series of $f$ at $x$ converges to $f(x)$. Hardy then reasoned that since the Fourier 
coefficients of a periodic function $f$ of bounded variation satisfied $a_n = O(1/n)$, 
$b_n = O(1/n)$, the Fourier series of such a function converged to $\frac{1}{2}\bigl(f(x+0) + f(x-0)\bigr)$. 
In fact, this is the classical Dirichlet–Jordan theorem. Furthermore, observe 
that since Cesàro summability implies Abel summability, it follows that for $f$ as in 
Fejér’s theorem, we have 

$$\lim_{r \to 1^-} \left( \frac{1}{2} a_0 + (a_1 \cos x + b_1 \sin x) r + (a_2 \cos 2x + b_2 \sin 2x) r^2 + \cdots \right) = \frac{1}{2} \bigl( f(x+0) + f(x-0) \bigr).$$

This equation simplifies to 

$$\lim_{r \to 1^-} \frac{1}{2\pi} \int_0^{2\pi} \frac{1 - r^2}{1 - 2r \cos (t - x) + r^2} f(t)\, dt = \frac{1}{2} \bigl( f(x+0) + f(x-0) \bigr).$$

When Hilbert saw Fejér’s work, he requested Fejér to attempt a proof of a similar 
theorem for the Laplace series where a function $f(\theta, \phi)$ was expanded in terms of 
surface harmonics. Fejér was unsuccessful in this effort for some years. Finally, while 
looking at a book on Bessel functions, he saw F. G. Mehler’s integral formula for 
Legendre polynomials: 

$$P_n(\cos\theta) = \frac{2}{\pi} \int_{\theta}^{\pi} \frac{\sin \left(n + \frac{1}{2}\right) t}{\sqrt{2(\cos\theta - \cos t)}}\, dt, \qquad 0 < \theta < \pi.$$

With the help of this result, in 1908 Fejér was able to prove that the Laplace series 
of a bounded integrable function was (C,2) summable to the function at any point of 
continuity.³² In 1913, H. Gronwall proved that (C,2) could be replaced by (C,1).³³ 


32.3 Karamata’s Proof of the Hardy–Littlewood Theorem 


Karamata’s short proof³⁴ of Littlewood’s theorem and of the more general 
Hardy–Littlewood theorem relied on Weierstrass’s approximation theorem. Karamata used it 
in the following form: For any Riemann integrable function $g(x)$ on (0,1) and every 
$\epsilon > 0$ there exist two polynomials $p(t)$ and $P(t)$ such that 

$$p(t) \le g(t) \le P(t) \quad\text{for } 0 \le t \le 1, \tag{32.42}$$

$$\int_0^1 \bigl( P(t) - p(t) \bigr)\, dt < \epsilon. \tag{32.43}$$


Karamata did not give the details of the proof of this result. It can be proved, 
however, by first taking $g(t)$ to be a continuous function. By Weierstrass’s theorem, 
there are polynomials $p(t)$ and $P(t)$ differing by at most $\frac{\epsilon}{4}$ from $g(t) - \frac{\epsilon}{4}$ and $g(t) + \frac{\epsilon}{4}$, 
respectively, for all $t \in [0,1]$. Clearly, the required result follows for $g(t)$ continuous. 
We next take $g(t)$ to be piecewise continuous, and the result follows because $g(t)$ can 


32 Fejér (1908). 
33 Gronwall (1913). 
34 Karamata (1930). 

be approximated by continuous functions. Finally, for any Riemann integrable function 
$g(t)$, there are step functions $m(t)$ and $M(t)$ such that $m(t) \le g(t) \le M(t)$ and 

$$\int_0^1 \bigl( M(t) - m(t) \bigr)\, dt < \frac{\epsilon}{2}.$$


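Karamata’s theorem: If $a_n \ge -K$, with $K > 0$ independent of $n$ and 

$$(1 - x) \sum_{n=0}^{\infty} a_n x^n \to A \quad\text{as } x \to 1^-,$$

then 

$$(1 - x) \sum_{n=0}^{\infty} a_n\, g(x^n)\, x^n \to A \int_0^1 g(t)\, dt$$

for every Riemann integrable function $g(t)$. 
In Karamata’s proof, it was obviously sufficient to take $K = 0$, for he could replace 
$a_n$ by $a_n + K$. Karamata then supposed $g(x) = x^{\alpha}$, $\alpha > 0$. Then he had 

$$(1 - x) \sum_{n=0}^{\infty} a_n\, g(x^n)\, x^n = (1 - x) \sum_{n=0}^{\infty} a_n x^{(\alpha+1)n} = \frac{1 - x}{1 - x^{\alpha+1}} \bigl( 1 - x^{\alpha+1} \bigr) \sum_{n=0}^{\infty} a_n \bigl( x^{\alpha+1} \bigr)^{n} \to \frac{A}{\alpha + 1} = A \int_0^1 t^{\alpha}\, dt,$$

as $x \to 1^-$. It followed by linearity that for every polynomial $P(x)$, 

$$(1 - x) \sum_{n=0}^{\infty} a_n\, P(x^n)\, x^n \to A \int_0^1 P(t)\, dt.$$

He could next apply (32.42) and (32.43) because $a_n$ was nonnegative; Karamata’s theorem 
followed. 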

To derive the Hardy–Littlewood theorem, Karamata set $x = e^{-1/n}$ and let $g(t)$ be 
the piecewise continuous function 

$$g(t) = \begin{cases} 0, & 0 \le t < e^{-1}, \\ \dfrac{1}{t}, & e^{-1} \le t \le 1. \end{cases}$$

He then arrived at $g(x^m) = 0$ for $m > n$, $g(x^m) x^m = 1$ for $m \le n$, and $\int_0^1 g(t)\, dt = 1$, 
thereby reducing his theorem to the Hardy–Littlewood theorem. In other words, given 
the one-sided Tauberian condition $a_n > -K$, if the Abel sum of $\sum_{n=0}^{\infty} a_n x^n$ was $A$, 
then the Cesàro sum of $\sum a_n$ was also $A$. 
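
Karamata’s theorem and its reduction to the Hardy–Littlewood theorem can be tried out on a simple sequence. The sketch below is illustrative only and assumes the sequence $a_n = 1 + (-1)^n$ (so $a_n \ge 0$ and $(1-x)\sum a_n x^n \to 1$, that is, $A = 1$); the helper names are hypothetical, and the infinite series is truncated where its terms are negligible.

```python
import math

def a(n):
    """Assumed example sequence: a_n = 1 + (-1)^n, i.e. 2, 0, 2, 0, ..."""
    return 2.0 if n % 2 == 0 else 0.0

def abel_weighted(a, g, x, terms=200_000):
    """(1 - x) * sum_{n >= 0} a(n) * g(x**n) * x**n, truncated after `terms` terms."""
    total, xn = 0.0, 1.0
    for n in range(terms):
        total += a(n) * g(xn) * xn
        xn *= x  # xn now holds x**(n + 1)
    return (1.0 - x) * total

def karamata_step(t):
    """The step function used by Karamata: 1/t on [1/e, 1], and 0 on [0, 1/e)."""
    return 1.0 / t if t >= math.exp(-1.0) else 0.0

# g(t) = t^2: the limit should be A * integral of t^2 over (0, 1) = 1/3.
poly_case = abel_weighted(a, lambda t: t * t, 0.999)

# x = e^{-1/N} with the step function: the expression reduces (up to the factor
# N * (1 - e^{-1/N}), which tends to 1) to the average (a_0 + ... + a_N) / N.
N = 1000
step_case = abel_weighted(a, karamata_step, math.exp(-1.0 / N))
print(poly_case, step_case)
```

The second value is essentially the Cesàro-type average of the $a_n$, which is how the single choice of $g$ converts an Abel-type hypothesis into a statement about averages.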


32.4 Wiener’s Proof of Littlewood’s Theorem 


Littlewood’s Tauberian theorem of 1910 was the first difficult and deep Tauberian 
result to be proved. It is therefore interesting to see how Wiener derived this theorem 
from his general theorem.³⁵ We restate Littlewood’s result: 

$$\text{If } \lim_{y \to 1^-} \sum_{n=0}^{\infty} a_n y^n = s \text{ and } n|a_n| \le K, \text{ then } \sum_{n=0}^{\infty} a_n = s.$$


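The first step in Wiener’s proof was to express $\sum a_n y^n$ as an integral. For that 
purpose he showed that $s(x) = \sum_{n \le x} a_n$ was bounded for $0 \le x < \infty$. By hypothesis, 
$\sum_{n=0}^{\infty} a_n e^{-n/x}$ was bounded for $0 < x < \infty$ and (using $n|a_n| \le K$) 

$$s(x) - \sum_{n=0}^{\infty} a_n e^{-n/x} = \sum_{n \le x} a_n \bigl( 1 - e^{-n/x} \bigr) - \sum_{n > x} a_n e^{-n/x} = O(1).$$

This showed that $s(x)$ was bounded so that he had 

$$\sum_{n=0}^{\infty} a_n e^{-nx} = \int_{0^-}^{\infty} e^{-ux}\, ds(u) = \int_0^{\infty} x e^{-ux} s(u)\, du.$$

Hence, 

$$s = \lim_{x \to 0^+} \int_0^{\infty} x e^{-ux} s(u)\, du = \lim_{\xi \to \infty} \int_{-\infty}^{\infty} e^{-\xi} e^{-e^{\eta - \xi}} s(e^{\eta})\, e^{\eta}\, d\eta.$$

So Wiener set $K_1(\xi) = e^{-\xi} e^{-e^{-\xi}}$ and observed that 

$$\int_{-\infty}^{\infty} K_1(\xi)\, d\xi = \int_{-\infty}^{\infty} e^{-\xi} e^{-e^{-\xi}}\, d\xi = \int_0^{\infty} e^{-x}\, dx = 1.$$

Thus, 

$$\lim_{\xi \to \infty} \int_{-\infty}^{\infty} K_1(\xi - \eta)\, s(e^{\eta})\, d\eta = s \int_{-\infty}^{\infty} K_1(\xi)\, d\xi$$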

35 Wiener (1958) pp. 104-106. 
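
and 

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} K_1(\xi)\, e^{-iu\xi}\, d\xi = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} x^{iu} e^{-x}\, dx = \frac{1}{\sqrt{2\pi}}\, \Gamma(1 + iu) \ne 0.$$

Therefore, $K_1(\xi)$ satisfied the hypotheses, (32.31) and (32.32), of his first Tauberian 
theorem. Wiener then chose $K_2(\xi)$ in such a manner that he obtained the (C, 1) 
summability of $\sum a_n$ to $s$. He set 

$$K_2(\xi) = \begin{cases} 0, & \xi < 0, \\ e^{-\xi}, & \xi \ge 0, \end{cases}$$

so that by his first Tauberian theorem, 

$$s = s \int_0^{\infty} e^{-\xi}\, d\xi = s \int_{-\infty}^{\infty} K_2(\xi)\, d\xi = \lim_{\xi \to \infty} \int_{-\infty}^{\infty} K_2(\xi - \eta)\, s(e^{\eta})\, d\eta = \lim_{\xi \to \infty} \int_{-\infty}^{\xi} e^{\eta - \xi} s(e^{\eta})\, d\eta = \lim_{x \to \infty} \frac{1}{x} \int_0^{x} s(y)\, dy.$$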


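Note that by applying Hardy’s theorem that (C,1) summability together with 
$a_n = O(1/n)$ implies convergence, the Hardy–Littlewood theorem follows. However, 
Wiener included a simple argument to prove Hardy’s theorem: For $\lambda > 0$, 

$$s = \lim_{x \to \infty} \frac{1}{\lambda x} \left( \int_0^{(1+\lambda)x} s(y)\, dy - \int_0^{x} s(y)\, dy \right) = \lim_{x \to \infty} \frac{1}{\lambda x} \int_x^{(1+\lambda)x} s(y)\, dy = \lim_{x \to \infty} \left( s(x) + \frac{1}{\lambda x} \int_x^{(1+\lambda)x} \bigl( s(y) - s(x) \bigr)\, dy \right).$$

The condition $a_n = O(1/n)$ then implied the necessary result: 

$$\left| \frac{1}{\lambda x} \int_x^{(1+\lambda)x} \bigl( s(y) - s(x) \bigr)\, dy \right| \le \frac{1}{\lambda x} \int_x^{(1+\lambda)x} \sum_{x < n \le y} \frac{K}{n}\, dy \le \sum_{n=[x]+1}^{[(1+\lambda)x]} \frac{K}{n} \le \frac{(\lambda x + 1) K}{[x]} \le 2\lambda K$$

for sufficiently large $x$. Hence $\limsup_{x \to \infty} |s(x) - s| \le 2\lambda K$; or, because $\lambda$ was an 
arbitrary positive number, $\lim_{x \to \infty} |s(x) - s| = 0$. This completed Wiener’s proof of 
Littlewood’s theorem. 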


32.5 Hardy and Littlewood: The Prime Number Theorem 


In their 1921 paper, “On a Tauberian Theorem for Lambert Series,” Hardy and 
Littlewood gave a very simple proof of the PNT based on the result that Lambert 
summability implied Abel summability.³⁶ As we mentioned before, their proof of the 
Lambert summability theorem employed a result, due to Landau, stronger than the 
PNT. Thus, although they did not produce a new proof, their derivation of the PNT 
insightfully reveals its Tauberian character. In this derivation, Hardy and Littlewood 
employed a number-theoretic result describing the average behavior of the arithmetic 
function $d(n)$, the number of divisors of $n$. Dirichlet first proved this result by his 
ingenious hyperbola method in an 1849 paper on the average behavior of arithmetic 
functions.³⁷ 
Hardy and Littlewood first showed that the series 

$$\sum_{n=1}^{\infty} \frac{\Lambda(n) - 1}{n}$$

was Lambert summable to $-2\gamma$, where $\gamma$ was Euler’s constant. Note here that the 
Lambert series could be written as 

$$f(y) = y \sum_{n=1}^{\infty} \frac{(\Lambda(n) - 1)\, e^{-ny}}{1 - e^{-ny}} = y \sum_{n=1}^{\infty} (\Lambda(n) - 1) \bigl( e^{-ny} + e^{-2ny} + e^{-3ny} + \cdots \bigr) = y \sum_{n=1}^{\infty} c_n e^{-ny}, \quad\text{where } c_n = \sum_{d \mid n} (\Lambda(d) - 1) = \ln n - d(n).$$

Next, they observed that 

$$\sum_{i=1}^{n} c_i = \ln n! - \sum_{i=1}^{n} d(i).$$

To estimate the logarithmic term, they applied Stirling’s formula and to estimate the 
second term they used the Dirichlet divisor theorem: 

$$\sum_{i=1}^{n} d(i) = n \ln n + (2\gamma - 1) n + O(\sqrt{n}).$$

These calculations gave them 

$$\frac{1}{n} \sum_{i=1}^{n} c_i \to -2\gamma \quad\text{as } n \to \infty.$$

By Frobenius’s theorem, the last result implied that 

$$\lim_{y \to 0^+} f(y) = \lim_{y \to 0^+} y \sum_{n=1}^{\infty} c_n e^{-ny} = -2\gamma.$$


36 Hardy and Littlewood (1921). 
37 Dirichlet (1969) vol. II, pp. 50-66. 

This proved the Lambert summability of $\sum_{n=1}^{\infty} \frac{\Lambda(n) - 1}{n}$. Hence, by their theorem 
that Lambert summability implies Abel summability, the series was Abel summable 
to $-2\gamma$. It was also clear that $\frac{\Lambda(n) - 1}{n} \ge -\frac{1}{n}$. Moreover, Hardy and Littlewood 
had earlier extended Littlewood’s theorem and this extension showed that this 
one-sided Tauberian condition was sufficient to obtain the ordinary convergence of 
$\sum_{n=1}^{\infty} \frac{\Lambda(n) - 1}{n}$. Recall that by (32.9), the convergence of $\sum a_n$ implied that 

$$\frac{a_1 + 2a_2 + \cdots + n a_n}{n} \to 0 \quad\text{as } n \to \infty.$$

For $a_n = \frac{\Lambda(n) - 1}{n}$ the last condition translated to 

$$\frac{\Lambda(1) - 1 + \Lambda(2) - 1 + \cdots + \Lambda(n) - 1}{n} \to 0 \quad\text{as } n \to \infty$$

or 

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Lambda(n) = 1,$$

and this was equivalent to the prime number theorem. 
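
The three asymptotic statements used in this derivation can be checked numerically at a moderate value of $n$. The sketch below is only an illustration and not part of Hardy and Littlewood’s argument: it assumes $n = 10^5$, computes $\sum d(i)$ through the identity $\sum_{i \le n} d(i) = \sum_{k \le n} \lfloor n/k \rfloor$, builds a table of $\Lambda$ with a simple sieve, and uses `math.lgamma` for $\ln n!$. All function and variable names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def divisor_summatory(n):
    """sum_{i <= n} d(i), computed as sum_{k <= n} floor(n / k)."""
    return sum(n // k for k in range(1, n + 1))

def von_mangoldt_table(n):
    """lam[k] = ln p if k is a power of the prime p, else 0, for 0 <= k <= n."""
    lam = [0.0] * (n + 1)
    is_composite = [False] * (n + 1)
    for p in range(2, n + 1):
        if not is_composite[p]:
            for m in range(p * p, n + 1, p):
                is_composite[m] = True
            q = p
            while q <= n:          # mark p, p^2, p^3, ... with ln p
                lam[q] = math.log(p)
                q *= p
    return lam

n = 100_000
divisor_total = divisor_summatory(n)

# Dirichlet divisor theorem: the error should be of order sqrt(n).
dirichlet_main = n * math.log(n) + (2 * EULER_GAMMA - 1) * n
divisor_error = divisor_total - dirichlet_main

# Mean of c_i = ln i - d(i): (ln n! - sum d(i)) / n should be near -2 * gamma.
c_mean = (math.lgamma(n + 1) - divisor_total) / n

# The PNT in the form (1/N) * sum_{n <= N} Lambda(n) -> 1.
psi_ratio = sum(von_mangoldt_table(n)) / n
print(divisor_error, c_mean, -2 * EULER_GAMMA, psi_ratio)
```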

In his 1971 paper, “The Quickest Proof of the Prime Number Theorem,” Littlewood 
observed that in 1918 he and Hardy proved (32.17).³⁸ He pointed out that though they 
had earlier proved the Tauberian theorem (with the one-sided condition mentioned 
above) necessary to deduce the quickest proof, they did not mention the PNT in their 
1918 paper. We note that it was this Tauberian theorem for which Karamata gave his 
nice proof, described by Littlewood as “highly sophisticated.” 


32.6 Wiener’s Proof of the PNT 


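In his work on the Tauberian theorem, one of Wiener’s fundamental aims was to prove 
the prime number theorem by means of Lambert summability.³⁹ Thus, he wished to 
determine the behavior of $\sum_{n \le x} \Lambda(n)$ as $x \to \infty$ from the behavior of 

$$\sum_{n=1}^{\infty} \Lambda(n) \frac{x^n}{1 - x^n}$$

as $x \to 1$. First, Wiener observed that 

$$\sum_{n=1}^{\infty} \Lambda(n) \frac{x^n}{1 - x^n} = \sum_{n=1}^{\infty} x^n \sum_{m \mid n} \Lambda(m) = \sum_{n=1}^{\infty} x^n \ln n$$

$$= \frac{x}{1 - x} \sum_{n=1}^{\infty} x^n \bigl( \ln (n + 1) - \ln n \bigr)$$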

38 Littlewood (1982) vol. 1, pp. 951-955. 
39 Wiener (1958) pp. 112-124. 

$$= \frac{x}{1 - x} \sum_{n=1}^{\infty} x^n \ln \left( 1 + \frac{1}{n} \right) = \frac{x}{1 - x} \sum_{n=1}^{\infty} x^n \left( \frac{1}{n} + O\!\left( \frac{1}{n^2} \right) \right).$$

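Note that the second line used summation by parts. Wiener next set $x = e^{-\xi}$ and 
multiplied by $-\xi$ to obtain 

$$\sum_{n=1}^{\infty} \Lambda(n) \frac{\xi e^{-n\xi}}{e^{-n\xi} - 1} = \frac{\xi e^{-\xi}}{e^{-\xi} - 1} \sum_{n=1}^{\infty} e^{-n\xi} \left( \frac{1}{n} + O\!\left( \frac{1}{n^2} \right) \right).$$

It followed from the right-hand side that as $\xi \to 0^+$, the series behaved like $\ln \xi$. 
Wiener therefore worked with the differentiated series. Upon differentiating the last 
equation, he arrived at 

$$\sum_{n=1}^{\infty} \Lambda(n) \frac{d}{d\xi} \left( \frac{\xi e^{-n\xi}}{e^{-n\xi} - 1} \right) = \sum_{n=1}^{\infty} \Lambda(n) \frac{e^{-2n\xi} - e^{-n\xi} + n\xi e^{-n\xi}}{(e^{-n\xi} - 1)^2}$$

$$= \frac{d}{d\xi} \left( \frac{\xi e^{-\xi}}{e^{-\xi} - 1} \right) \sum_{n=1}^{\infty} e^{-n\xi} \left( \frac{1}{n} + O\!\left( \frac{1}{n^2} \right) \right) - \frac{\xi e^{-\xi}}{e^{-\xi} - 1} \sum_{n=1}^{\infty} \left( 1 + O\!\left( \frac{1}{n} \right) \right) e^{-n\xi}$$

$$= O(1) \bigl( O(\ln \xi) + O(1) \bigr) + \bigl( 1 + O(\xi) \bigr) \left( \frac{1}{\xi} + O(1) + O(\ln \xi) \right)$$

$$= \frac{1}{\xi} + O(\ln \xi),$$

as $\xi \to 0^+$. 
Thus, he had 

$$\lim_{\xi \to 0^+} \xi \sum_{n=1}^{\infty} \Lambda(n) \frac{e^{-2n\xi} - e^{-n\xi} + n\xi e^{-n\xi}}{(e^{-n\xi} - 1)^2} = 1.$$

Wiener wrote the sum as a Stieltjes integral so that he could apply his second 
Tauberian theorem. Toward that end, he set 

$$g(x) = \sum_{n=1}^{[e^{x}]} \frac{\Lambda(n)}{n},$$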
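
so that the previous equation containing the limit took the form 

$$1 = \lim_{\xi \to 0^+} \int_0^{\infty} n\xi\, \frac{e^{-2n\xi} - e^{-n\xi} + n\xi e^{-n\xi}}{(e^{-n\xi} - 1)^2}\, dg(\ln n) = \lim_{x \to \infty} \int_{-\infty}^{\infty} e^{y-x}\, \frac{e^{-2e^{y-x}} - e^{-e^{y-x}} + e^{y-x} e^{-e^{y-x}}}{(e^{-e^{y-x}} - 1)^2}\, dg(y).$$

To understand the next step, compare the last expression with the corresponding 
expression in Wiener’s theorem. On this basis, we can see how Wiener next wrote 

$$K_1(x) = e^{-x}\, \frac{e^{-2e^{-x}} - e^{-e^{-x}} + e^{-x} e^{-e^{-x}}}{(e^{-e^{-x}} - 1)^2}$$

and then 

$$\int_{-\infty}^{\infty} K_1(x)\, dx = \int_0^{\infty} \frac{e^{-2\xi} - e^{-\xi} + \xi e^{-\xi}}{(e^{-\xi} - 1)^2}\, d\xi = -\int_0^{\infty} \frac{d}{d\xi} \left( \frac{\xi}{e^{\xi} - 1} \right) d\xi = \lim_{\epsilon \to 0^+} \frac{\epsilon e^{-\epsilon}}{1 - e^{-\epsilon}} = 1.$$

Thus, he had $A = 1$ in the hypothesis of his theorem. One may check that $K_1(x)$ 
satisfies (32.35) and since $g(y)$ is monotonic, 

$$\int_n^{n+1} |dg(x)| = \int_n^{n+1} dg(x).$$

Moreover, the latter expression is bounded for $-\infty < n < \infty$. Finally, Wiener had 
only to check that the Fourier transform of $K_1(x)$ did not vanish. So he computed 

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} K_1(x)\, e^{-iux}\, dx = -\frac{1}{\sqrt{2\pi}} \int_0^{\infty} \frac{d}{d\xi} \left( \frac{\xi}{e^{\xi} - 1} \right) \xi^{iu}\, d\xi = -\lim_{\lambda \to 0^+} \frac{1}{\sqrt{2\pi}} \int_0^{\infty} \frac{d}{d\xi} \left( \frac{\xi}{e^{\xi} - 1} \right) \xi^{iu + \lambda}\, d\xi.$$

Integration by parts converted the last expression to 

$$\lim_{\lambda \to 0^+} \frac{\lambda + iu}{\sqrt{2\pi}} \int_0^{\infty} \frac{\xi^{iu + \lambda} e^{-\xi}}{1 - e^{-\xi}}\, d\xi = \lim_{\lambda \to 0^+} \frac{\lambda + iu}{\sqrt{2\pi}} \int_0^{\infty} \sum_{n=1}^{\infty} \xi^{iu + \lambda} e^{-n\xi}\, d\xi$$

$$= \lim_{\lambda \to 0^+} \frac{\lambda + iu}{\sqrt{2\pi}} \sum_{n=1}^{\infty} \frac{\Gamma(\lambda + 1 + iu)}{n^{\lambda + 1 + iu}} = \lim_{\lambda \to 0^+} \frac{\lambda + iu}{\sqrt{2\pi}}\, \zeta(\lambda + 1 + iu)\, \Gamma(\lambda + 1 + iu)$$

$$= \frac{iu}{\sqrt{2\pi}}\, \zeta(1 + iu)\, \Gamma(1 + iu).$$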
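
The work of Hadamard and Vallée-Poussin showed that $\zeta(1 + iu)$ did not vanish 
for any real $u$, and hence the Fourier transform of $K_1(x)$ did not vanish. 

Finally, Wiener had to choose $K_2(x)$ appropriately so that he got the PNT in the 
form $\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Lambda(n) = 1$. Note that in this application, $K_2(x)$ had to be 
continuous; it could not be the piecewise continuous function 

$$K_2(x) = \begin{cases} 0, & x < 0, \\ e^{-x}, & x > 0, \end{cases}$$

although, if allowed, this would have yielded the result immediately. So Wiener 
defined two continuous functions 

$$K_{21}(x) = \begin{cases} 0, & x < -\epsilon, \\ \dfrac{x + \epsilon}{\epsilon}, & -\epsilon \le x < 0, \\ e^{-x}, & 0 \le x, \end{cases} \qquad K_{22}(x) = \begin{cases} 0, & x < 0, \\ \dfrac{x}{\epsilon}\, e^{-\epsilon}, & 0 \le x < \epsilon, \\ e^{-x}, & \epsilon \le x. \end{cases}$$

Here he verified that 

$$\int_{-\infty}^{\infty} K_{21}(x)\, dx = 1 + \frac{\epsilon}{2} \quad\text{and}\quad \int_{-\infty}^{\infty} K_{22}(x)\, dx = e^{-\epsilon} \left( 1 + \frac{\epsilon}{2} \right).$$

Wiener’s second Tauberian theorem then implied 

$$1 + \frac{\epsilon}{2} = \lim_{x \to \infty} \int_{-\infty}^{\infty} K_{21}(x - y)\, dg(y) = \lim_{x \to \infty} \left( \int_{-\infty}^{x} e^{y - x}\, dg(y) + \int_{x}^{x + \epsilon} \frac{x - y + \epsilon}{\epsilon}\, dg(y) \right)$$

$$\ge \limsup_{x \to \infty} \int_{-\infty}^{x} e^{y - x}\, dg(y) = \limsup_{N \to \infty} \frac{1}{N} \int_{1}^{N} n\, dg(\ln n) = \limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Lambda(n) \tag{32.44}$$

and 

$$e^{-\epsilon} \left( 1 + \frac{\epsilon}{2} \right) = \lim_{x \to \infty} \int_{-\infty}^{\infty} K_{22}(x - y)\, dg(y) = \lim_{x \to \infty} \left( \int_{-\infty}^{x - \epsilon} e^{y - x}\, dg(y) + \int_{x - \epsilon}^{x} \frac{x - y}{\epsilon}\, e^{-\epsilon}\, dg(y) \right)$$

$$= \lim_{x \to \infty} \left( \int_{-\infty}^{x} e^{y - x}\, dg(y) - \int_{x - \epsilon}^{x} \left( e^{y - x} - \frac{x - y}{\epsilon}\, e^{-\epsilon} \right) dg(y) \right)$$

$$\le \liminf_{x \to \infty} \int_{-\infty}^{x} e^{y - x}\, dg(y) = \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Lambda(n). \tag{32.45}$$

Note that in the above calculation, one may use the fact that 

$$e^{y - x} - \frac{x - y}{\epsilon}\, e^{-\epsilon} \ge 0 \quad\text{for } x - \epsilon \le y \le x.$$

Wiener let $\epsilon \to 0$ in the inequalities (32.44) and (32.45) to get 

$$1 \ge \limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Lambda(n) \quad\text{and}\quad 1 \le \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Lambda(n).$$

These inequalities implied that $\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Lambda(n)$ existed and was equal to 1. 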


This proof of the PNT used only one property of the zeta function: that it did not 
vanish on the line consisting of points with real part equal to 1. 
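
The first step of Wiener’s computation rests on the identity $\sum_{m \mid n} \Lambda(m) = \ln n$, which turns the Lambert series of $\Lambda$ into the power series $\sum x^n \ln n$. The sketch below checks both statements numerically; it is an illustration rather than part of Wiener’s proof, the function names are hypothetical, and the infinite series are truncated at a point where the omitted terms are negligible for the assumed value $x = 0.5$.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = ln p if n = p**k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:   # strip every factor of the smallest prime p
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)          # no divisor up to sqrt(n): n itself is prime

def divisor_sum_of_lambda(n):
    """sum of Lambda(m) over the divisors m of n; should equal ln n."""
    return sum(von_mangoldt(m) for m in range(1, n + 1) if n % m == 0)

def lambert_side(x, terms):
    """Truncation of sum_{n >= 1} Lambda(n) * x^n / (1 - x^n)."""
    return sum(von_mangoldt(n) * x**n / (1 - x**n) for n in range(1, terms + 1))

def power_side(x, terms):
    """Truncation of sum_{n >= 1} x^n * ln n."""
    return sum(x**n * math.log(n) for n in range(1, terms + 1))

identity_gap = max(abs(divisor_sum_of_lambda(n) - math.log(n)) for n in range(1, 201))
series_gap = abs(lambert_side(0.5, 200) - power_side(0.5, 200))
print(identity_gap, series_gap)
```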


32.7 Kac’s Proof of Wiener’s Theorem 


The basic principle behind Wiener’s Tauberian theorem is simple but penetrating. 
Mark Kac illustrated this insight by producing a short proof of the 1928 form of 
Wiener’s theorem.⁴⁰ This proof uses only Fubini’s theorem and the uniqueness of 
Fourier transforms; like Wiener’s 1928 theorem, it is powerful enough to produce the 
PNT as a consequence. 


Kac’s theorem: Suppose 

$$K_1(x) \in L^1(-\infty, \infty), \qquad x^2 K_1(x) \in L^1(-\infty, \infty)$$

and 

$$k_1(\xi) = \int_{-\infty}^{\infty} K_1(x)\, e^{i\xi x}\, dx \ne 0, \qquad -\infty < \xi < \infty.$$


40 Kac (1965). 

If $m(y)$ is a bounded measurable function such that for all $x$, 

$$\int_{-\infty}^{\infty} K_1(x - y)\, m(y)\, dy = 0,$$

then $m(y) = 0$ almost everywhere. 

In proving this theorem, Kac realized that the condition $x^2 K_1(x) \in L^1$ implied 
that $k_1(\xi)$ was twice continuously differentiable. Thus, let $\Phi$ be the set of all twice 
continuously differentiable functions with compact support. Since $k_1(\xi) \ne 0$ for all $\xi$, 
it follows that every $\phi \in \Phi$ is of the form $k_1 \psi$ for some $\psi \in \Phi$. In short, $k_1 \Phi = \Phi$. 
Let $\phi \in \Phi$ and let $F$ be the Fourier transform of $\phi$; that is, let 

$$F(x) = \int_{-\infty}^{\infty} \phi(\xi)\, e^{i\xi x}\, d\xi.$$

Because $\phi$ has compact support, $F(x)$ is defined for all $x \in \mathbb{C}$ and $F'(x)$ exists. 
Hence, $F$ is an entire function, and $\phi$ can be chosen such that $F$ is not identically 
zero. Thus, $F$ has only a countable number of zeros. Since $F \in L^1(-\infty, \infty)$ and 
$|F(x)|\,|K_1(x - y)|\,|m(y)|$ is integrable as a function of two variables $(x, y)$, we can 
apply Fubini’s theorem and change the order of integration: 

$$0 = \int_{-\infty}^{\infty} F(x) \left( \int_{-\infty}^{\infty} K_1(x - y)\, m(y)\, dy \right) dx = \int_{-\infty}^{\infty} m(y) \left( \int_{-\infty}^{\infty} K_1(x - y)\, F(x)\, dx \right) dy = \int_{-\infty}^{\infty} m(y) \left( \int_{-\infty}^{\infty} k_1(\xi)\, \phi(\xi)\, e^{i\xi y}\, d\xi \right) dy.$$

Because $k_1 \Phi = \Phi$, we can conclude that for all $\phi \in \Phi$, we have 

$$0 = \int_{-\infty}^{\infty} m(y) \left( \int_{-\infty}^{\infty} \phi(\xi)\, e^{i\xi y}\, d\xi \right) dy.$$

Now $\Phi$ is closed under translation so that we can replace $\phi(\xi)$ by $\phi(\xi - \alpha)$ and change 
variables to arrive at 

$$0 = \int_{-\infty}^{\infty} m(y) \left( \int_{-\infty}^{\infty} \phi(\xi)\, e^{i\xi y}\, d\xi \right) e^{i\alpha y}\, dy,$$

for all real $\alpha$. By the definition of $F$, this gives 

$$0 = \int_{-\infty}^{\infty} m(y)\, F(y)\, e^{i\alpha y}\, dy$$

for all real $\alpha$; and the uniqueness of Fourier transforms implies $m(y) F(y) = 0$ for 
almost all $y$. Since $F$ can be chosen to have countably many zeros, we conclude that 
$m(y) = 0$ almost everywhere. 


32.8 Gelfand: Normed Rings 


Wiener derived the final form of his Tauberian theorem by means of his famous 
theorem on nonvanishing and absolutely convergent Fourier series. About ten years 
later, in 1941, Izrail Gelfand provided a short and elegant derivation of this theorem, 
based on his theory of normed rings.⁴¹ In this effort, Gelfand utilized an abstract 
formulation of the Fourier transform, now known as the Gelfand transform. In a short 
note published in 1939, Gelfand developed the elements of the theory of commutative 
Banach algebras. These are algebras $B$ over the complex numbers, containing a 
multiplicative identity $e$; such algebras are complete with respect to a norm $\|\cdot\|$, such 
that $\|e\| = 1$ and $\|xy\| \le \|x\| \cdot \|y\|$ for $x$ and $y$ in $B$; thus, Gelfand named them 
normed rings. In two further short notes published in the same year, Gelfand gave 
applications of his theory to absolutely convergent Fourier series and integrals and to 
the ring of almost periodic functions. Gelfand used the work of Wiener and Pitt on 
absolutely convergent Fourier series and integrals as a springboard in his construction 
of Banach algebras. He succeeded in obtaining short proofs of the Wiener–Pitt results 
by revealing their essentially algebraic character. 

In his two-page fundamental paper of 1939, Gelfand denoted a normed ring, or 
commutative Banach algebra, by R and observed as his first theorem that any maximal 
ideal M was closed in R. His third theorem, now called the Gelfand-Mazur theorem, 
stated that R/M was isomorphic to the field of complex numbers. This theorem 
originated with the 1918 result of Alexander Ostrowski (1893-1986), student of 
Landau and Klein, that a complete Archimedean field is isomorphic to either the field 
of real numbers or the field of complex numbers. In 1938 this was generalized by 
Stanislaw Mazur (1905-1981), student of Stefan Banach, who proved that a normed 
associative real division algebra was isomorphic to the field of real numbers, or to 
the field of complex numbers, or to the noncommutative field of quaternions. In his 
1941 paper “Normierte Ringe,” Gelfand gave a beautiful proof of the particular case of 
Mazur’s theorem he needed. This proof employed Liouville’s theorem that a bounded 
entire function is a constant. 

Gelfand was then able to associate with each x ∈ R a complex number x(M), to 
obtain a complex valued function on the set of all maximal ideals of R. He defined 
a topology on the set of maximal ideals to make the set into a compact Hausdorff 
space and the functions x(M) continuous. In his 1941 paper, Gelfand also noted the 
easily proved result that |x(M)| ≤ ||x||. This depended on the lemma that for the 
multiplicative identity e and any y ∈ R, if ||e + y|| < 1, then y was invertible. In 
fact, it is easily verified that 

\[ y^{-1} = -\left(e + (e+y) + (e+y)^2 + \cdots\right). \tag{32.46} \]
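The verification behind (32.46) is the geometric series argument in a complete normed ring. A brief sketch follows; the symbol w = e + y is introduced here only as a convenience and does not appear in Gelfand's text.

```latex
% Assumption: w := e + y is auxiliary notation for this sketch, with \|w\| < 1.
\begin{align*}
y &= -(e - w), \\
(e - w)\sum_{n=0}^{N} w^{n} &= e - w^{N+1} \longrightarrow e \quad (N \to \infty),
   \quad\text{since } \|w^{N+1}\| \le \|w\|^{N+1} \to 0, \\
y^{-1} &= -\sum_{n=0}^{\infty} w^{n} = -\bigl(e + (e+y) + (e+y)^2 + \cdots\bigr).
\end{align*}
```

The series converges in R because R is complete and the norms of its terms are dominated by the convergent geometric series with ratio ||e + y||.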


Now observe that if x(M) = λ ∈ C, then x = λe + z where z ∈ M. Assume λ ≠ 0, 
because if λ = 0, then the inequality |x(M)| ≤ ||x|| is obvious. Then for y = z/λ 
we have 

\[ \frac{|x(M)|}{\|x\|} = \frac{|\lambda|}{\|\lambda e + z\|} = \frac{1}{\|e + y\|} \le 1, \]

because if ||e + y|| < 1, then y would be invertible and not be in M. 

41 For the quotation and papers referred to in this section, see Gelfand (1988) vol. 1. 

We now have the result needed to understand Gelfand’s simple proof of Wiener’s 
theorem on nonvanishing absolutely convergent Fourier series. He let R be the set 
of all functions $f(t) = \sum_{n=-\infty}^{\infty} a_n e^{int}$ such that $\sum_{n=-\infty}^{\infty} |a_n| < \infty$; he then let 
$\|f\| = \sum_{n=-\infty}^{\infty} |a_n|$. This gave R the structure of a commutative Banach algebra. 
Gelfand argued that for any maximal ideal M and $e^{it} \in R$, $e^{it}(M)$ was some complex 
number a. He then obtained 

\[ |e^{it}(M)| = |a| \le \|e^{it}\| = 1, \quad\text{and}\quad \frac{1}{|a|} \le \|e^{-it}\| = 1, \]

and hence $a = e^{it_0}$ for some real number $t_0$. This meant that any trigonometric 
polynomial $\sum_{n=-N}^{N} a_n e^{int}$ corresponded to the number $\sum_{n=-N}^{N} a_n e^{int_0}$. And since the 
mapping R → R/M was continuous, to every function f(t) ∈ R, there corresponded 
a number $f(t_0)$. Therefore, the maximal ideal M consisted of all functions f(t) such 
that $f(t_0) = 0$. It followed that if a function f(t) did not vanish at any point, then 
f(t) was not a member of any maximal ideal of R. Gelfand could then conclude that 
f(t) had an inverse in R, proving the theorem. 

Gelfand’s proof of the Gelfand-Mazur theorem, that R′ = R/M is isomorphic to C, 
began by supposing that x ∈ R′ and x ≠ λe for any complex number λ. Then 
$(x - \lambda e)^{-1}$ exists for every λ. Moreover, 


\[ \lim_{h\to 0} \frac{(x - (\lambda + h)e)^{-1} - (x - \lambda e)^{-1}}{h} = (x - \lambda e)^{-2}, \]

and 

\[ (x - \lambda e)^{-1} \to 0 \quad\text{as } \lambda \to \infty, \tag{32.47} \]

since, for |λ| > ||x||, equation (32.46) implies that $\|(x - \lambda e)^{-1}\| \le \frac{1}{|\lambda| - \|x\|}$. Hence 
for any multiplicative linear functional φ : R → C, that is, φ(xy) = φ(x)φ(y), 
the function $\varphi((x - \lambda e)^{-1})$ is a bounded entire function and therefore a constant. By 
(32.47), this constant must be zero. It follows that $(x - \lambda e)^{-1}$ is zero and the theorem 
is proved by the contradiction: 

\[ e = (x - \lambda e)^{-1}(x - \lambda e) = 0. \]


F. Riesz, S. Mazur, and others studied normed rings before Gelfand, but Gelfand’s 
concept of the space of maximal ideals unified several isolated earlier results 
and opened up new avenues for further research. In fact, the space of maximal 
ideals became important in algebraic geometry also, though in that area Alexander 
Grothendieck showed that the space of prime ideals produced better results. 

In 1930, Gelfand (1913-2009) moved to Moscow from Odessa without completing 
his secondary education. He had studied mathematics on his own from an early age; 
his lack of books spurred him to great creativity. At the age of 15, for example, 
he discovered the Euler-Maclaurin formula. He studied a textbook on differential 
calculus and Taylor series, but he had no book on the integral calculus. While 
investigating the problem of the area under $y = x^n$, he was led to consider the sums 
$1^n + 2^n + \cdots + m^n$. He soon found the Euler-Maclaurin formula and the generating 
function for the Bernoulli numbers by means of the Taylor series. In a similar way, 
he discovered Newton’s formula for the sums of powers in the theory of symmetric 
functions. In Moscow, he worked in odd jobs such as doorkeeper at the Lenin Library, 
while also teaching mathematics. The great Russian mathematician A. N. Kolmogorov 
(1903-1987) took an interest in Gelfand, who very soon found himself lecturing at 
the Moscow State University and studying with Kolmogorov, who directed him to 
problems in functional analysis. This resulted in Gelfand’s 1935 thesis, “Abstract 
Functions and Linear Operators.” The theory of commutative normed rings was the 
subject of his 1938 doctoral thesis. 

Gelfand made major contributions to several areas of mathematics such as representation 
theory, differential equations, computational mathematics, and biocybernetics. 
At the age of 80, in collaboration with M. Kapranov and A. Zelevinsky, he was starting 
to develop a theory of hypergeometric functions of many variables. Though he was 
unable to bring this work to perfection, he had seen the importance of such a theory 
when he was much younger. For example, in his 1956 lecture “On Some Problems of 
Functional Analysis,” he gave his thoughts on the matter: 


It is known that almost all the special functions of one variable to be met with in mathematical 
physics may be obtained from the general hypergeometric function of Gauss by a suitable 
choice of parameters. These same functions appear as elements of representations of the simplest 
classical groups, namely the groups of rotations of the sphere and of the Lobacevskii plane. This 
connection lies in the nature of the matter, since the special functions make their appearance by 
way of considerations connected with this or that invariance of a problem under transformations 
of a space. Hence it is natural to construct the theory of hypergeometric functions of several 
variables, relying on results and methods of the theory of the representations of compact or locally 
compact Lie groups. It is thus necessary so to construct the theory of hypergeometric functions 
that it should contain the theory of general spherical functions, connected with the representations 
of semi-simple groups. 


32.9 Exercises 


(1) Prove that if $A_1, A_2, A_3, \ldots, A_n, \ldots$ is a sequence such that the difference 
$A_{n+1} - A_n$ converges to a limit A as n → ∞, then $A_n/n$ converges to the 
same limit. Cauchy stated and proved this result in his Analyse algébrique; see 
Cauchy (1989) or Bradley and Sandifer (2009) pp. 35 and 42. Show that this 
result implies that if a series $\sum a_n$ converges to A, then it converges (C, 1) to A. 

(2) Prove the theorem of Frobenius that Cesaro summability implies Abel 
summability. Observe that 

\[ \sum_{n=0}^{\infty} a_n x^n = (1 - x)^2 \sum_{n=0}^{\infty} (A_0 + \cdots + A_n)\, x^n, \]


where $A_n = \sum_{k=0}^{n} a_k$. See Frobenius (1880). 

(3) Prove Borel’s theorem that ordinary convergence implies Borel summability, 
that is, if $a_n \to A$ as n → ∞, then 

\[ \lim_{x\to\infty} e^{-x} \sum_{n=0}^{\infty} a_n \frac{x^n}{n!} = A. \]

See Hardy (1949) p. 80. 

(4) Show that the condition in the second theorem of Tauber given by (32.9) is 
implied by the condition $n a_n \to 0$ as n → ∞. See Tauber (1897). 

(5) Prove Cesaro’s theorem that if $\sum a_n = A$ and $\sum b_n = B$ and $C_n = c_0 + c_1 + 
c_2 + \cdots + c_n$, where $c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0$, then 

\[ \frac{C_0 + C_1 + \cdots + C_n}{n + 1} \to AB \quad\text{as } n \to \infty. \]

See Cesaro (1890). 
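(6) Suppose a(x) and b(x) are continuous functions. Prove that if 

\[ \int_0^{\infty} a(x)\,dx = A, \qquad \int_0^{\infty} b(x)\,dx = B, \]

then 

\[ \lim_{x\to\infty} \frac{1}{x} \int_0^x dt \int_0^t du \int_0^u a(w)\, b(u - w)\, dw = AB. \]

Next, deduce that if $\int_0^{\infty} dx \int_0^x a(t)\, b(x - t)\, dt$ is convergent, then its value is 
AB. 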


(7) If $\int_0^{\infty} a(x)\,dx = A$, $\int_0^{\infty} b(x)\,dx = B$, |xa(x)| < K, and |xb(x)| < K, 
then 

\[ \int_0^{\infty} dx \int_0^x a(t)\, b(x - t)\, dt = AB. \]

The theorems in this and the previous exercise are due to Hardy. See Hardy 
(1966-1979) vol. 6, pp. 210-212. 

(8) In 1971, at the age of 86 and in memory of his student Harold Davenport 
(1907-1969), Littlewood gave a short proof of the PNT depending on the 
following known results: 

• The Hardy-Littlewood theorem of which Karamata gave a two-page proof; 
• The functional equation of the zeta function; 
• The Cahen-Mellin integral for $e^{-y}$ in terms of the gamma function; 
• The complex zeros ρ of ζ(s) have a real part between 0 and 1; 


gs) _ 1. 
ris) = O00) +520, aG=p! 


• For s = −1 + it, $\zeta'(s)/\zeta(s) = O(t^A)$, where A is a positive absolute constant; 
Littlewood remarked that A would not necessarily have the same value from 
one occurrence to the next; 
• The Dirichlet series for $-\zeta'(s)/\zeta(s)$ when Re s > 1; 
• If N(T) denotes the number of zeros ρ = β + iγ with 0 < γ < T, then 
$N(T) = O(T^A)$. 

Note that the last result is extremely weak and far more is and was known about 
N(T), but Littlewood did not require a stronger result. First prove Littlewood’s 
first lemma: Given a large positive $T_0$, there is a T, with $AT_0 < T < AT_0$, such 
that $\zeta'(s)/\zeta(s) = O(T^A)$ for s = σ + iT, −1 ≤ σ ≤ 2. The corresponding result 
for s = σ − iT follows by symmetry. Next, demonstrate Littlewood’s second 
lemma that for y > 0, 

\[ -2\pi i \sum_{n=1}^{\infty} \Lambda(n)\, e^{-ny} = \int_{2 - i\infty}^{2 + i\infty} \Gamma(s)\, \frac{\zeta'(s)}{\zeta(s)}\, y^{-s}\, ds. \]

From this, Littlewood deduced the PNT in one page. Show how this can be 
done. See Littlewood (1982) vol. 2, pp. 951-955. 

(9) Show that ζ(s) has no zeros when the real part of s is one. The original proofs 
of Hadamard and Vallée-Poussin were slightly more complicated than later 
proofs. See Titchmarsh and Heath-Brown (1986) p. 48. 


32.10 Notes on the Literature 


Hardy (1966-1979) vol. 6 contains all his papers on summability or Tauberian theory, 
as well as his joint papers with Littlewood on the topic. Littlewood (1982) vol. 1 
includes his papers on this subject that were not joint with Hardy. 

All of Wiener’s work on Tauberian theory can be found in Wiener (1976-1985) 
vol. 2. Most of Wiener’s work discussed in this book was taken from Wiener (1958), 
a course of lectures on Fourier transforms and their applications, given by Wiener at 
Cambridge University in 1932. Norman Levinson, a student of Wiener, has written 
an interesting article explaining Wiener’s progression from harmonic analysis to 
Tauberian theory. This appeared in the Bulletin of the AMS, vol. 72, 1, part II and this 
volume also includes very helpful accounts by experts on Wiener’s many contributions 
to mathematics. Masani (1990) is a comprehensive biography of Wiener, discussing 
his mathematical work, its myriad applications, the development of his thought, and 
giving good references. 

In his original paper on the short proof of Littlewood’s theorem, Karamata also 
introduced the concept of majorizability to obtain a new condition for the convergence 
of Abel summable series. Although this idea sheds light on Karamata’s proof, the 
portion of the paper dealing with majorizability was removed by Landau and the paper 
was reduced to Karamata (1930). See Nikolić (2009). 

Tucciarone (1973) gives a helpful history of summable series, from their origins 
through the 1920s; it includes a good bibliography. Korevaar (2004) is an encyclopedic 
treatment of Tauberian theory, covering a century of developments, with numerous 
historical comments and references. 


33 


Elliptic Functions: Eighteenth Century 


33.1 Preliminary Remarks 


In 1847, Jacobi wrote to Fuss that Euler had been motivated to found elliptic function 
theory by reading Count Fagnano’s Produzioni Matematiche.¹ Indeed, in the work 
of the then unknown Fagnano, Euler discovered the key to the apparently intractable 
elliptic integral. Giulio Carlo Fagnano (1682-1766) studied theology and philosophy 
in Rome but avoided mathematics, though he was encouraged to study it. Many 
years later, after reading Malebranche’s Concerning the Search for Truth, he taught 
himself mathematics with great devotion, and from 1714 to 1720, he published some 
interesting papers on integrals in little-known Italian journals. In 1718 he published 
his now-famous paper on dividing the lemniscate into several equal parts. His results 
were not noted at first, but were brought to light by an interesting chain of events. 
In the early 1740s, Fagnano was consulted concerning the possible instability of 
the dome of St. Peter’s. In 1750, in compensation for his help, Fagnano’s collected 
papers were published at the order of Pope Benedict XIV. Fagnano then applied 
for membership in the Berlin Academy. Euler was assigned the task of evaluating 
the quality of the mathematical portion of Fagnano’s papers. Euler was intrigued by 
Fagnano’s results on the lemniscatic integral $\int \frac{dx}{\sqrt{1-x^4}}$; these results inspired some of 
Euler’s most brilliant work on integral calculus, laying the foundation for the theory 
of elliptic integrals and functions. It goes without saying that Fagnano was admitted 
to the Academy. 

The early work of Jakob and Johann Bernoulli² on the lemniscatic integral $\int \frac{dx}{\sqrt{1-x^4}}$ 
led Fagnano to investigate the topic. The equation of the lemniscate is given in 
cartesian coordinates by 

\[ (x^2 + y^2)^2 = a^2 (x^2 - y^2) \tag{33.1} \]

and in polar coordinates by 

\[ r^2 = a^2 \cos 2\theta, \tag{33.2} \]

where a is a constant. The graph of the lemniscate resembles the symbol for infinity, 
the diameter of one side given by r = a when θ = 0. For convenience, take a = 1. 
The cartesian coordinates x and y are then given by 

\[ 2x^2 = r^2 + r^4, \tag{33.3} \]
\[ 2y^2 = r^2 - r^4, \tag{33.4} \]

where 0 < r < 1. A simple calculation shows that if s denotes the arc length of the 
lemniscate, then 

\[ ds = \frac{dr}{\sqrt{1 - r^4}} \]

or 

\[ s(r) = \int_0^r \frac{dt}{\sqrt{1 - t^4}}. \tag{33.5} \]

1 Fagnano (1911). 
2 Bernoulli (1744) vol. 1, pp. 601-612 and Bernoulli (1742) vol. 1, pp. 119-122. 

The lemniscate appeared in Jakob Bernoulli’s solution to his own 1691 problem 
on the shape of an elastic band constrained by its own weight. Johann Bernoulli 
later encountered the same curve when he asked how to find a curve such that the 
time taken to traverse it was proportional to the distance from a fixed point. Jakob 
Bernoulli opined that the lemniscatic integral could not be evaluated in terms of 
the inverse trigonometric, logarithmic, or rational functions. However, he offered the 
series expansion 


\[ \int_0^1 \frac{dt}{\sqrt{1 - t^4}} = \sum_{n=0}^{\infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{n!\, 2^n (4n + 1)}, \tag{33.6} \]


obtained by expanding the denominator by the binomial theorem and integrating term 
by term. The Bernoullis also investigated the problem of bisecting the arcs of curves 
such as the parabolic spiral. 
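To see how slowly the series (33.6) converges, one can sum it numerically. The sketch below is only an illustration: the function name `bernoulli_partial_sum` and the chosen term counts are assumptions of this example, and the reference value is the fifteen-place figure of Stirling quoted later in this section.

```python
import math

def bernoulli_partial_sum(terms: int) -> float:
    """Partial sum of sum_{n>=0} (1*3*...*(2n-1)) / (n! 2^n (4n+1))."""
    total = 0.0
    coeff = 1.0  # (1*3*...*(2n-1)) / (n! 2^n), which equals 1 when n = 0
    for n in range(terms):
        total += coeff / (4 * n + 1)
        coeff *= (2 * n + 1) / (2 * (n + 1))  # advance the coefficient from n to n + 1
    return total

STIRLING_VALUE = 1.31102877714605987  # value of the integral quoted from Stirling

for terms in (10, 100, 10_000):
    approx = bernoulli_partial_sum(terms)
    print(terms, approx, STIRLING_VALUE - approx)
```

The terms decay only like a constant times n^(-3/2), so the remainder after N terms shrinks roughly like N^(-1/2); this is the behavior that made the series impractical for accurate evaluation.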

Fagnano’s most famous accomplishment was to bisect an arc of a lemniscate and 
trisect and quinsect the full arc from r = 0 to r = 1.³ His methods were such that 
these procedures could actually be accomplished using a straight edge and compass. 
His proofs were based on obtaining appropriate changes of variables and on transforming 
lemniscatic integrals into other lemniscatic integrals. For example, he found 
that if 

\[ t = \frac{1}{r} \sqrt{1 - \sqrt{1 - r^4}}, \tag{33.7} \]

then 

\[ \frac{dr}{\sqrt{1 - r^4}} = \frac{\sqrt{2}\, dt}{\sqrt{1 + t^4}}, \tag{33.8} \]

and if 

\[ \frac{u\sqrt{2}}{\sqrt{1 - u^4}} = \frac{1}{r} \sqrt{1 - \sqrt{1 - r^4}}, \tag{33.9} \]

then 

\[ \frac{dr}{\sqrt{1 - r^4}} = \frac{2\, du}{\sqrt{1 - u^4}}. \tag{33.10} \]

In order to better understand (33.7), write it as 

\[ r^2 = \frac{2t^2}{1 + t^4}. \tag{33.11} \]

Fagnano may have made this substitution on the basis of a similar transformation 

\[ r = \frac{2t}{1 + t^2}, \tag{33.12} \]

used to rationalize the integrand for arcsine: $\arcsin x = \int_0^x \frac{dr}{\sqrt{1 - r^2}}$. Note that from the 
substitution given by (33.12), we have 

\[ \frac{dr}{\sqrt{1 - r^2}} = \frac{2\, dt}{1 + t^2}. \]

It was thus natural for Fagnano to consider (33.11), even though it did not 
rationalize the integrand, so that he instead obtained (33.8). To understand (33.9), 
compare it with (33.7) to get 

\[ t = \frac{u\sqrt{2}}{\sqrt{1 - u^4}}, \tag{33.13} \]

or 

\[ t^2 = \frac{2u^2}{1 - u^4}. \tag{33.14} \]

Making this substitution is only reasonable, since it produces 

\[ \frac{dt}{\sqrt{1 + t^4}} = \frac{\sqrt{2}\, du}{\sqrt{1 - u^4}}. \tag{33.15} \]

This means that if substitutions (33.11) and (33.14) are applied successively, the 
result is (33.10). Moreover, the relationship between r and u can be expressed by 

\[ r^2 = \frac{4u^2 (1 - u^4)}{(1 + u^4)^2}. \tag{33.16} \]

3 Fagnano (1911) vol. 2, pp. 293-297 and 304-313. 

Thus, any arc in the first quadrant of a lemniscate, with an endpoint at the origin, can 
be bisected using straight edge and compass. See this by observing that the arc length 
of the lemniscate is given by (33.5) and that the arc length corresponding to the radius 
vector r is double the arc length given by u, where r and u are related by (33.16). One 
may check that r and u can be obtained from one another by solving only quadratic 
equations. Also recall that one may use (33.3) and (33.4) to obtain the coordinates of 
the points from the radius vector. This shows that we have geometric constructibility. 
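The claim that r and u are related through quadratic equations only can be made concrete. The following sketch, whose function names are hypothetical, recovers u from r by solving the two quadratics that come from (33.11) and (33.14) for t² and u² in turn, and then checks the result against (33.16). Only square roots and rational operations appear, which is what constructibility with straight edge and compass requires.

```python
import math

def bisect_radius(r: float) -> float:
    """Radius u of the point bisecting the arc from the origin to radius r, 0 < r <= 1."""
    # (33.11) rearranged: r^2 t^4 - 2 t^2 + r^2 = 0; take the smaller root for t^2.
    t_sq = (1.0 - math.sqrt(1.0 - r**4)) / r**2
    # (33.14) rearranged: t^2 u^4 + 2 u^2 - t^2 = 0; take the positive root for u^2.
    u_sq = (math.sqrt(1.0 + t_sq**2) - 1.0) / t_sq
    return math.sqrt(u_sq)

def doubled_radius(u: float) -> float:
    """Radius r given by (33.16): r^2 = 4 u^2 (1 - u^4) / (1 + u^4)^2."""
    return 2.0 * u * math.sqrt(1.0 - u**4) / (1.0 + u**4)

for r in (0.5, 0.8, 1.0):
    u = bisect_radius(r)
    print(r, u, doubled_radius(u))  # the last column should reproduce r
```

For r = 1 the first quadratic gives t² = 1 and the second gives u² = √2 − 1, the bisection point of the full arc.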


Fagnano’s use of $\frac{\sqrt{2}\, dt}{\sqrt{1 + t^4}}$ in (33.8) may seem peculiar, since this expression does not 
appear to take the form of an arc length of a lemniscate. Watson and Siegel have both 
explained this very nicely⁴ in terms of later ideas due to Gauss and Abel. Set $t = e^{\pi i/4} v$ 
in (33.11) so that 

\[ r^2 = \frac{2i v^2}{1 - v^4} \]

and 

\[ \frac{dr}{\sqrt{1 - r^4}} = \frac{(1 + i)\, dv}{\sqrt{1 - v^4}}. \tag{33.17} \]

Moreover, by (33.14) 

\[ v^2 = \frac{-2i u^2}{1 - u^4} \tag{33.18} \]

and 

\[ \frac{dv}{\sqrt{1 - v^4}} = \frac{(1 - i)\, du}{\sqrt{1 - u^4}}. \tag{33.19} \]


Now note that these transformations produce points on the lemniscate, but they are 
imaginary points. Thus, Siegel points out that (33.17) and (33.19) are examples of 
“complex multiplication” of the lemniscatic integral and, when applied successively, 
produce the bisection. Indeed, Fagnano was familiar with the use of complex numbers 
in integrals; he discovered that 


\[ \int \frac{dt}{1 + t^2} = \frac{1}{2i} \log \frac{1 + it}{1 - it}, \]


a result also published by Johann Bernoulli in 1702, as mentioned in Chapter 12. 
Fagnano noted the amusing particular case 


\[ \pi = 2i \log \left( \frac{1 - i}{1 + i} \right). \]
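Both identities can be checked with Python's standard `cmath` module. This is a sketch under one stated assumption: the logarithm is taken on its principal branch, which the historical text does not specify. The function names are illustrative only.

```python
import cmath
import math

def arctan_via_log(t: float) -> complex:
    """(1/(2i)) * log((1 + it)/(1 - it)), using the principal branch of log."""
    return cmath.log((1 + 1j * t) / (1 - 1j * t)) / 2j

def fagnano_pi() -> complex:
    """Fagnano's particular case: 2i * log((1 - i)/(1 + i)), principal branch."""
    return 2j * cmath.log((1 - 1j) / (1 + 1j))

print(arctan_via_log(0.75), math.atan(0.75))
print(fagnano_pi(), math.pi)
```

Since (1 − i)/(1 + i) = −i, the principal logarithm is −iπ/2, and multiplying by 2i gives π.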


4 Watson (1933) and Siegel (1969) vol. 1, pp. 1-7. 

Upon reading Fagnano, Euler perceived that the doubling of the arc length of the 
lemniscate corresponded to the double angle formula for the sine function. This in turn 
was a particular case of the addition formula for the sine function. Thus, he gradually 
understood that Fagnano’s transformation formulas might be particular cases of an 
addition formula for elliptic integrals. Euler’s earlier efforts to evaluate these integrals 
in terms of elementary functions having reached a dead end, he sensed in the work of 
Fagnano an innovative and productive direction for the theory of elliptic integrals. 

Consider Euler’s state of mind when he began reading Fagnano. As a student of 
Johann Bernoulli, he knew that the integral $\int \frac{dt}{\sqrt{1 - t^4}}$ could probably not be evaluated 
in terms of logarithms or inverse trigonometric functions. Then in 1738, he reproved 
Fermat’s theorem that the equation $z^2 = x^4 - y^4$ had no nontrivial integer solutions. 
Note that this was one result in number theory for which Fermat wrote down a proof! 
Now the substitution (33.12), rationalizing $\frac{dt}{\sqrt{1 - t^2}}$, also provided the rational solutions 
of $z^2 = 1 - x^2$ or the integer solutions of $z^2 = y^2 - x^2$. Euler realized that $\int \frac{dt}{\sqrt{1 - t^4}}$ could 
not be rationalized by substitution, since that would imply that Fermat’s equation 
could have integer solutions, a contradiction. 
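The link between the rationalizing substitution and Diophantine solutions can be illustrated with exact rational arithmetic. In this sketch the function names and the sample parameters are hypothetical; the parametrization itself is the substitution x = 2t/(1 + t²) of (33.12), which makes √(1 − x²) equal to (1 − t²)/(1 + t²).

```python
from fractions import Fraction

def rational_point(t: Fraction) -> tuple[Fraction, Fraction]:
    """Rational point (x, z) on z^2 = 1 - x^2 obtained from x = 2t/(1 + t^2)."""
    x = 2 * t / (1 + t * t)
    z = (1 - t * t) / (1 + t * t)
    return x, z

def integer_solution(p: int, q: int) -> tuple[int, int, int]:
    """Integers (x, y, z) with z^2 = y^2 - x^2, from t = p/q after clearing denominators."""
    return 2 * p * q, q * q + p * p, q * q - p * p

x, z = rational_point(Fraction(1, 2))
print(x, z, z * z == 1 - x * x)
print(integer_solution(1, 2))
```

Every rational t yields a rational point, so a substitution that rationalized the lemniscatic integrand in the same way would produce rational points on the corresponding quartic, which is what Fermat's theorem forbids.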

Here note that Euler was well aware of the connection between Diophantine 
equations of the form $y^2 = ax^2 + bx + c$ and the integration of expressions of the 
form $\sqrt{ax^2 + bx + c}$. In a 1723 letter to Goldbach, Daniel Bernoulli made specific 
mention of this connection and so did Johann Bernoulli in his integral calculus 
lectures, published in the 1740s, long after he delivered them. See Section 12.8 in 
this connection. Thus, it is safe to assume that Euler was aware that the elliptic 
integral could not be evaluated in terms of elementary functions. He was searching 
for a new path, and found it in Fagnano, upon whom he heaped praise. Within a few 
weeks of receiving Fagnano’s Produzioni Matematiche, Euler gave a favorable report 
to the Berlin Academy, including some of his own reflections. He soon wrote a paper 
reworking and generalizing Fagnano’s results and then went on to publish several more 
papers, which now fill two volumes of his Opera Omnia. 

Euler’s papers and letters to Goldbach indicate that he saw a close connection 
between $\int \frac{dt}{\sqrt{1 - t^2}}$ and $\int \frac{dt}{\sqrt{1 - t^4}}$. In fact, in his letter of May 30, 1752,⁵ Euler 
mentioned to Goldbach that 

\[ \frac{dx}{\sqrt{1 - xx}} = \frac{dy}{\sqrt{1 - yy}} \tag{33.20} \]

had the complete integral 

\[ yy + xx = cc + 2xy\sqrt{1 - cc}, \tag{33.21} \]

while 

\[ \frac{dx}{\sqrt{1 - x^4}} = \frac{dy}{\sqrt{1 - y^4}} \tag{33.22} \]

had the complete integral 

\[ yy + xx = cc + 2xy\sqrt{1 - c^4} - ccxxyy. \tag{33.23} \]

5 Fuss (1968) vol. 1, pp. 564-568. 


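Now from (33.16) we see that 

\[ \int_0^r \frac{dt}{\sqrt{1 - t^4}} = 2 \int_0^u \frac{dt}{\sqrt{1 - t^4}} \tag{33.24} \]

when 

\[ r = \frac{2u\sqrt{1 - u^4}}{1 + u^4}. \tag{33.25} \]

The corresponding result for the arcsine function is 

\[ \int_0^r \frac{dt}{\sqrt{1 - t^2}} = 2 \int_0^u \frac{dt}{\sqrt{1 - t^2}} \tag{33.26} \]

when 

\[ r = 2u\sqrt{1 - u^2}. \tag{33.27} \]

These two relations are equivalent to the double angle formula for sin x, that is, 
sin 2x = 2 sin x cos x. And this is in turn a particular case of the addition formula 

\[ \sin(x + y) = \sin x \cos y + \cos x \sin y. \]

Next write this in terms of integrals as 

\[ \int_0^r \frac{dt}{\sqrt{1 - t^2}} = \int_0^u \frac{dt}{\sqrt{1 - t^2}} + \int_0^v \frac{dt}{\sqrt{1 - t^2}} \quad\text{for } r = u\sqrt{1 - v^2} + v\sqrt{1 - u^2}. \tag{33.28} \]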


Recall that Euler thought that he could view Fagnano’s bisection of the lemniscatic 
arc as a particular case of a possible addition formula for the lemniscatic function. In 
1753, Euler found the required addition formula:⁶ 

\[ s(u) + s(v) = s(r), \qquad r = \frac{u\sqrt{1 - v^4} + v\sqrt{1 - u^4}}{1 + u^2 v^2}, \tag{33.29} \]

where s(u) was the lemniscatic integral defined by (33.5). To understand the method 
by which Euler obtained (33.29), note first that upon integrating (33.20), one obtains 

\[ \arcsin x = \arcsin y + \arcsin c = \arcsin\left( y\sqrt{1 - c^2} + c\sqrt{1 - y^2} \right), \]

implying that the complete integral of (33.20) is 

\[ x = y\sqrt{1 - c^2} + c\sqrt{1 - y^2}. \]

This is actually the addition formula for the sine function, also given by (33.28), and it 
is equivalent to the complete integral (33.21) from Euler’s letter. In a similar manner, 
Euler derived the addition formula for the lemniscatic function (33.29) by solving 
equation (33.23) for x or for y. None of his papers on this topic give an account 
of how he found the complete integral; they merely verify that the differential $\frac{dx}{\sqrt{1 - x^4}}$ 
remained invariant under the transformation obtained by solving (33.23) for y in terms 
of x. 

6 Eu. 1-20 pp. 58-79, especially pp. 65-66. E 251. 
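A numerical check makes the relation between the addition formula (33.29) and the complete integral (33.23) tangible. The sketch below uses Simpson's rule as a stand-in for exact integration; the function names, the sample values u = 0.3 and v = 0.4, and the step count are assumptions of this example, chosen so that all radii stay well below 1.

```python
import math

def s(r: float, steps: int = 2000) -> float:
    """Simpson approximation of the lemniscatic integral (33.5) on [0, r], 0 <= r < 1."""
    h = r / steps
    total = 1.0 + 1.0 / math.sqrt(1.0 - r**4)  # integrand at t = 0 and at t = r
    for j in range(1, steps):
        t = j * h
        total += (4.0 if j % 2 else 2.0) / math.sqrt(1.0 - t**4)
    return total * h / 3.0

def add_radii(u: float, v: float) -> float:
    """Radius r on the right-hand side of (33.29)."""
    return (u * math.sqrt(1 - v**4) + v * math.sqrt(1 - u**4)) / (1 + u * u * v * v)

def complete_integral_gap(x: float, y: float, c: float) -> float:
    """Left side minus right side of (33.23); zero when (33.23) holds."""
    return y * y + x * x - (c * c + 2 * x * y * math.sqrt(1 - c**4) - c * c * x * x * y * y)

u, v = 0.3, 0.4
r = add_radii(u, v)
print(s(u) + s(v), s(r))             # the two arc lengths should agree
print(complete_integral_gap(r, u, v))  # should be zero up to rounding
```

Taking x = r, y = u, and c = v shows that solving (33.23) for x reproduces the radius in (33.29); setting u = v recovers the doubling relation (33.25).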

Euler also extended (33.29) to the more general quartic⁷ $P(x) = 1 + mx^2 + nx^4$. 
He proved that the complete integral of 

\[ \frac{dx}{\sqrt{P(x)}} = \frac{dy}{\sqrt{P(y)}} \]

was the equation 

\[ -nc^2 x^2 y^2 + x^2 + y^2 = c^2 + 2xy\sqrt{1 + mc^2 + nc^4}, \]

where c was an arbitrary constant. Upon solving for y, one obtained 

\[ y = \frac{x\sqrt{P(c)} + c\sqrt{P(x)}}{1 - nc^2 x^2}. \]


Euler obtained the addition formula for the case in which P(x) was the general 
quartic $A + 2Bx + Cx^2 + 2Dx^3 + Ex^4$. By means of a fractional linear transformation, 
he reduced the general quartic to the particular case $1 + mx^2 + nx^4$. The slight drawback 
in Euler’s technique was that it introduced complex coefficients, whereas he intended 
to use only real coefficients. This lacuna was filled by Legendre in a paper of 1792.⁸ 
Euler proved these results on the addition formula during the 1750s; during the next 
twenty years, he went on to prove similar results for elliptic integrals of the second 
and third kinds, to use terminology introduced by Legendre. 

This body of Euler’s results brought the theory of elliptic integrals to prominence, 
not only in the context of the integral calculus and Diophantine equations, but also 
in areas of applied mathematics such as elasticity and dynamics, where numerical 
evaluations were paramount. Since elliptic integrals could not be evaluated in terms 
of elementary functions, numerical methods were sought. The early work of Jakob 
Bernoulli showed this to be a tough problem. His 1694 paper discussed the elastic 
curve defined by 

\[ f(x) = \int_0^x \frac{t^2\, dt}{\sqrt{1 - t^4}}, \tag{33.30} \]

with arc length s(x) given by (33.5). Bernoulli determined the intervals within which 
the values of f(1) and s(1) would have to fall. In a 1704 paper, Bernoulli was able to use 
series methods to specify these values within shorter intervals: 1.3088173 < s(1) < 
1.3152635 and 0.5983546 < f(1) < 0.6004034. But Bernoulli’s hypergeometric 
series (33.6) did not converge rapidly enough, so that these values were not too 
accurate. Then, in his Methodus Differentialis, James Stirling produced a vastly better 
evaluation, correct to fifteen decimal places. 

7 ibid. pp. 63-67. 
8 Legendre (1792) pp. 9-10. 

Recall that Stirling’s book contained several methods for transforming hypergeometric 
and other series into more rapidly convergent series. In proposition 11 of his 
book,⁹ he applied a specific method to Bernoulli’s hypergeometric series, obtaining: 

\[ \int_0^1 \frac{dx}{\sqrt{1 - x^4}} = 1.31102877714605987, \tag{33.31} \]
\[ \int_0^1 \frac{x^2\, dx}{\sqrt{1 - x^4}} = 0.59907011736779611. \tag{33.32} \]


Incidentally, it was a source of pride to Euler when he found, through his 1737 work 
on the elastic curve, that the product of these two integrals was exactly π/4. To get this 
result today, one would evaluate the beta integrals in terms of the gamma function. 
And it is interesting that soon after 1737, Euler found this method of evaluating in 
terms of the gamma function. Euler also used the series method to obtain numerical 
approximations of some elliptic integrals. Again, these results were not accurate to 
many decimal places because the series did not converge rapidly enough. Using the 
transformation of integrals, Lagrange, Legendre, and Gauss found better methods, 
also of great theoretical significance. But these methods owed a debt to the work of 
John Landen. 
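The modern evaluation mentioned above can be sketched in a few lines. The substitution u = x⁴ turns each integral into one quarter of a beta integral, which is then expressed through gamma functions; the helper name `quartic_beta_integral` is an assumption of this example, not notation from the text.

```python
import math

def quartic_beta_integral(a: float) -> float:
    """Integral_0^1 x^(a-1) (1 - x^4)^(-1/2) dx = (1/4) * B(a/4, 1/2), via gamma functions."""
    return 0.25 * math.gamma(a / 4) * math.gamma(0.5) / math.gamma(a / 4 + 0.5)

arc_length = quartic_beta_integral(1.0)  # the integral in (33.31)
elastic = quartic_beta_integral(3.0)     # the integral in (33.32)
print(arc_length, elastic)
print(arc_length * elastic, math.pi / 4)
```

The gamma factors cancel in the product, leaving π/4, and the two individual values can be compared digit by digit with Stirling's figures.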

In 1771, John Landen presented a fundamental transformation of elliptic integrals; 
he elaborated on this result in another paper published four years later.¹⁰ He stated 
his problem in geometric terms, expressing the length of the arc of any hyperbola in 
terms of two elliptic arcs. The Landen transformation can be stated, as formulated by 
Legendre,¹¹ as the theorem: If sin(2φ − θ) = k sin θ, then 

\[ (1 + k) \int_0^{\theta} (1 - k^2 \sin^2 u)^{-1/2}\, du = 2 \int_0^{\varphi} \left( 1 - \frac{4k}{(1 + k)^2} \sin^2 u \right)^{-1/2} du. \tag{33.33} \]

We note the important particular case 

\[ (1 + k) \int_0^{\pi/2} (1 - k^2 \sin^2 \theta)^{-1/2}\, d\theta = \int_0^{\pi/2} \left( 1 - \frac{4k}{(1 + k)^2} \sin^2 \theta \right)^{-1/2} d\theta. \tag{33.34} \]

9 Stirling and Tweddle (2003) pp. 74-75. 
10 Landen (1771) and (1775). 
11 Legendre (1811-1817) vol. 1, p. 81. 

In Legendre’s notation 

\[ K(k) = \int_0^{\pi/2} (1 - k^2 \sin^2 \theta)^{-1/2}\, d\theta \tag{33.35} \]

so that (33.34) can be expressed as 

\[ K\left( \frac{2\sqrt{k}}{1 + k} \right) = (1 + k) K(k). \tag{33.36} \]

Note that (33.36) is also referred to as Landen’s (quadratic) transformation. This 
and Euler’s addition formula were the two pillars on which the early theory of elliptic 
integrals and functions was constructed. Mittag-Leffler insightfully wrote, “Landen 
does not seem, however, to have fully understood the value of his discovery.”¹² 
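Landen's transformation lends itself to a direct numerical check. The sketch below approximates the integrals with a composite midpoint rule; the helper names, the modulus k = 0.5, the angle θ = 1, and the step count are all assumptions of this example. The angle φ is taken from the principal value of the arcsine in sin(2φ − θ) = k sin θ, which is one choice among the solutions of that equation.

```python
import math

def midpoint(f, a: float, b: float, steps: int = 20000) -> float:
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))

def F(angle: float, k: float) -> float:
    """Integral_0^angle (1 - k^2 sin^2 u)^(-1/2) du, for a modulus 0 <= k < 1."""
    return midpoint(lambda u: 1.0 / math.sqrt(1.0 - (k * math.sin(u)) ** 2), 0.0, angle)

k = 0.5
k1 = 2.0 * math.sqrt(k) / (1.0 + k)  # transformed modulus appearing in (33.36)

# Complete case (33.36): K(k1) = (1 + k) K(k).
print(F(math.pi / 2, k1), (1.0 + k) * F(math.pi / 2, k))

# Incomplete case (33.33): choose theta, then phi from sin(2*phi - theta) = k*sin(theta).
theta = 1.0
phi = 0.5 * (theta + math.asin(k * math.sin(theta)))
print((1.0 + k) * F(theta, k), 2.0 * F(phi, k1))
```

Each printed pair should agree to many decimal places; the transformed modulus k1 is larger than k, which is why repeated application drives the integral toward an elementary limiting case.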

But Lagrange quickly grasped the applicability of Landen’s transformation to the 
numerical approximation of elliptic integrals, and in this connection, he discovered the 
concept of the arithmetic-geometric mean of two numbers. In 1784-1785, Lagrange 
presented these ideas in the Turin Academy journal under the title “Sur une nouvelle 
méthode de calcul intégral.”¹³ Lagrange was then a member of the Berlin Academy, 
but was born in Turin and was a founder of its academy. Consequently, he was quite 
interested in the growth of the Turin Academy and published several papers in its 
journal. In his paper, Lagrange expressed Landen’s transformation in the elegant form: 
If p > q > 0, 

\[ p' = p + \sqrt{p^2 - q^2}, \qquad q' = p - \sqrt{p^2 - q^2} \tag{33.37} \]

and 

\[ R(p, q, y) = \sqrt{(1 \pm p^2 y^2)(1 \pm q^2 y^2)}, \qquad y' = \frac{yR}{1 \pm q^2 y^2}, \tag{33.38} \]

then 

\[ \frac{dy}{R} = \frac{dy'}{R'}, \tag{33.39} \]

where R′ = R(p′, q′, y′). 

He also observed that $p = \frac{p' + q'}{2}$, the arithmetic mean of p′ and q′, and that 
$q = \sqrt{p'q'}$, the geometric mean of p′ and q′. He used these relations to define two 
sequences. Let $p_0 = p$, $q_0 = q$ with p > q and for any positive or negative integer n 
set 

\[ p_{n+1} = p_n + \sqrt{p_n^2 - q_n^2}, \qquad q_{n+1} = p_n - \sqrt{p_n^2 - q_n^2} \tag{33.40} \]

or 

\[ p_n = \frac{p_{n+1} + q_{n+1}}{2}, \qquad q_n = \sqrt{p_{n+1} q_{n+1}}. \tag{33.41} \]

The relations (33.40) and (33.41) define the bilateral sequence 

\[ \ldots, p_{-1}, q_{-1}, p_0, q_0, p_1, q_1, \ldots \]

12 Mittag-Leffler (1923). 
13 Lagrange (1867-1892) vol. 2, pp. 281-312. 


In the positive direction, for increasing n, the $q_n$ terms tend to zero and the $p_n$ 
terms tend to infinity. So for the purpose of approximate evaluation of the integral, 
whose form did not change by (33.39), Lagrange took $q_n = 0$ for large enough n. 
This reduced the elliptic integral to $\int \frac{dy}{\sqrt{1 \pm p_n^2 y^2}}$, an exactly computable integral. In 
the negative direction, Lagrange observed that the arithmetic means $p_{-n}$ and the 
geometric means $q_{-n}$ had the same limit, because $p_{-n} - q_{-n} \to 0$ as n → ∞. 
Thus, for sufficiently large n, the elliptic integral could be approximately evaluated 
from the exactly computable integral $\int \frac{dy}{1 \pm p_{-n}^2 y^2}$. 


Apparently Lagrange did not enjoy numerical calculation as much as Newton, 
Stirling, and Euler did, so he did not actually apply his method to find approximate 
values for elliptic integrals. This was left to Legendre, who effectively used iterated 
forms of (33.33) and (33.34) to construct numerical tables of such integrals. 
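Lagrange's form of the transformation can be tested numerically, and the negative direction of his sequences is the arithmetic-geometric mean iteration. The sketch below makes several labeled assumptions: it takes the + sign in both factors of (33.38), it uses a central difference with step h in place of the exact derivative, and the function names and sample values are hypothetical.

```python
import math

def R(p: float, q: float, y: float) -> float:
    """R(p, q, y) from (33.38), taking the + sign in both factors."""
    return math.sqrt((1 + (p * y) ** 2) * (1 + (q * y) ** 2))

def y_next(p: float, q: float, y: float) -> float:
    """y' from (33.38), again with the + sign."""
    return y * R(p, q, y) / (1 + (q * y) ** 2)

def means_limit(p: float, q: float, tol: float = 1e-15) -> float:
    """Negative direction of (33.41): repeat arithmetic and geometric means until they agree."""
    while abs(p - q) > tol * p:
        p, q = (p + q) / 2, math.sqrt(p * q)
    return p

p, q, y, h = 2.0, 1.0, 0.3, 1e-6
p1 = p + math.sqrt(p * p - q * q)  # p' from (33.37)
q1 = p - math.sqrt(p * p - q * q)  # q' from (33.37)

# (33.39) says dy'/dy = R'/R; compare a central difference with that ratio.
slope = (y_next(p, q, y + h) - y_next(p, q, y - h)) / (2 * h)
print(slope, R(p1, q1, y_next(p, q, y)) / R(p, q, y))

# One step of the means applied to (p', q') returns (p, q); iterating gives the common limit.
print(means_limit(p1, q1))
```

The means converge quadratically, so a handful of iterations already exhausts double precision; this is the practical advantage of the negative direction over the slowly converging series discussed earlier.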

We mention in passing that Euler used results obtainable from the addition formula 
to study Diophantine equations of the form $y^2 = p(x)$ where p(x) was a polynomial 
of degree four with integer coefficients. Since Euler did not note the connection with 
elliptic integrals, we are not certain that he was aware of it. In 1834, Jacobi reviewed 
some of these papers of Euler and on that basis, he concluded that Euler knew of 
this relationship. In a similar way, Euler as well as Lagrange employed quadratic (or 
second-order) transformations (isogenies) to study the special Diophantine equations 
$z^2 = x^4 + y^4$ and $z^2 = 2x^4 - y^4$. This may be one reason that Lagrange did 
not refer to Landen in his 1784-85 paper on elliptic integrals. Another reason, of 
course, is that mathematicians in the eighteenth century were not in the habit of 
giving an exhaustive list of references. Thus, Mittag-Leffler assumed that Landen and 
Lagrange had independently discovered the Landen transform. In fact, on January 3, 
1777, Lagrange wrote to his friend Condorcet that he had seen Landen’s 1775 paper 
containing the theorem reducing the problem of the rectification of arcs of ellipses to 
a problem of hyperbolic arcs.¹⁴ Lagrange wrote that he found this a singular result 
and that he had not yet verified it. Apparently, he found the time to study Landen and 
went on to discover the arithmetic-geometric mean and its use in numerical evaluation 
of integrals. It is remarkable that he did not do more with it. Perhaps he was already 
beginning to lose interest in mathematical research. After 1785, he produced no further 
original mathematical results, though he did publish important and influential books, 
including Mécanique analytique of 1788 and Fonctions analytiques of 1797. 


'4 Lagrange (1867-1892) vol. 14, p. 41. 
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33.2 Fagnano Divides the Lemniscate 


In 1691, Jakob Bernoulli observed that the arc length of the parabolic spiral $(a - r)^2 = 2ab\theta$ was given by 

$$s = \int \sqrt{1 + \frac{r^2(a-r)^2}{a^2b^2}}\; dr,$$

and that the integrand was an even function of $\frac{1}{2}a - r$. Therefore, he had 

$$\int_{\frac{1}{2}a - c}^{\frac{1}{2}a} = \int_{\frac{1}{2}a}^{\frac{1}{2}a + c}.$$

He could then conclude that the length of the arc of the spiral joining the points 
corresponding to $r = \frac{1}{2}a - c$ and $r = \frac{1}{2}a$ equaled the length of the arc from $r = \frac{1}{2}a$ 
to $r = \frac{1}{2}a + c$, although the two arcs were incongruent. 
Fagnano extended this and other of Bernoulli’s results to arcs of other curves, 
including the lemniscate, given by 

$$(x^2 + y^2)^2 = x^2 - y^2 \quad\text{or}\quad r^2 = \cos 2\theta.$$


In 1718, Fagnano published his two-part work on the division of the lemniscate.¹⁵ 
In the first part he stated that if 

$$u = \frac{\sqrt{1 - z^2}}{\sqrt{1 + z^2}} \tag{33.42}$$

then 

$$\frac{dz}{\sqrt{1 - z^4}} = -\frac{du}{\sqrt{1 - u^4}}. \tag{33.43}$$

Fagnano’s statement of this result took an apparently more general form, but we can 
obtain it from this one by replacing u and z by $\frac{u}{a}$ and $\frac{z}{a}$, where a is a constant. He 
observed that the result could be proved by differentiating (33.42) and substituting in 
(33.43). As an immediate consequence of this theorem, it was clear that 

$$u^2z^2 + u^2 + z^2 - 1 = 0 \tag{33.44}$$

was an integral of the equation 

$$\frac{dz}{\sqrt{1 - z^4}} = \pm\frac{du}{\sqrt{1 - u^4}}. \tag{33.45}$$


15 Fagnano’s results discussed here can be found in Fagnano (1911) pp. 293-297, pp. 304-313. 


Figure 33.1 Fagnano’s lemniscate. 


As mentioned earlier, Euler noticed this fact very quickly. We can write the theorem 
in modern form as 


$$\int_0^z \frac{dt}{\sqrt{1 - t^4}} = \int_u^1 \frac{dt}{\sqrt{1 - t^4}}, \tag{33.46}$$

when u and z are related by (33.42) or (33.44). This means that in Figure 33.1, if 
O, P, Q, and A denote points on the lemniscate corresponding to the values 0, z, u, 
and 1 of the radius vector, then arc OP = arc QA in length. 
In the last section of part 1 of his paper, Fagnano observed that the full lemniscatic 
arc OA would be bisected if the points P and Q coincided. This would happen when 
z = u and then (33.44) would imply 


$$z^4 + 2z^2 - 1 = 0, \quad\text{or}\quad z = u = \sqrt{\sqrt{2} - 1}.$$


Thus, this constructible number would express the distance between the point of 
bisection and the origin. 
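
The relation between (33.42) and (33.46), together with the bisection value, can be checked numerically. The following Python sketch is only an illustration and is not part of Fagnano's argument; the helper names `simpson`, `lemniscate_arc`, and `complement` are hypothetical, and the substitution t = sin θ is an assumption made here to keep the integrand smooth at t = 1.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def lemniscate_arc(z):
    """Integral of dt/sqrt(1 - t^4) from 0 to z, for 0 <= z <= 1.

    With t = sin(theta) the integrand becomes 1/sqrt(1 + sin(theta)^2).
    """
    return simpson(lambda th: 1 / math.sqrt(1 + math.sin(th) ** 2),
                   0.0, math.asin(z))

def complement(z):
    """Fagnano's relation (33.42): u = sqrt(1 - z^2) / sqrt(1 + z^2)."""
    return math.sqrt(1 - z * z) / math.sqrt(1 + z * z)

quarter = lemniscate_arc(1.0)            # length of the full arc OA
z = 0.3
u = complement(z)
arc_OP = lemniscate_arc(z)               # arc from O to P
arc_QA = quarter - lemniscate_arc(u)     # arc from Q to A

bisection = math.sqrt(math.sqrt(2) - 1)  # the value z = u found above
print(arc_OP, arc_QA)
print(lemniscate_arc(bisection), quarter / 2)
```

The two printed pairs should agree to within the accuracy of the quadrature, and the bisection value should be a fixed point of `complement`.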

Fagnano started the second part of the paper on the division of the lemniscate with 
the theorem: If 

$$x = \frac{\sqrt{1 \mp \sqrt{1 - z^4}}}{z} \tag{33.47}$$

then 

$$\frac{\pm dz}{\sqrt{1 - z^4}} = \frac{dx\sqrt{2}}{\sqrt{1 + x^4}}. \tag{33.48}$$

His proof consisted of the observations that 

$$dx = \frac{\pm dz\,\sqrt{1 \mp \sqrt{1 - z^4}}}{z^2\sqrt{1 - z^4}}$$

and 

$$\frac{\sqrt{1 + x^4}}{\sqrt{2}} = \frac{\sqrt{1 \mp \sqrt{1 - z^4}}}{z^2}.$$


Note that in (33.48) the differential on the right-hand side is apparently not a 
lemniscatic differential. So Fagnano stated another theorem: If 

$$x = \frac{u\sqrt{2}}{\sqrt{1 - u^4}} \tag{33.49}$$

then 

$$\frac{du}{\sqrt{1 - u^4}} = \frac{1}{\sqrt{2}} \cdot \frac{dx}{\sqrt{1 + x^4}}. \tag{33.50}$$

Once again, his proof simply noted that differentiating (33.49) resulted in 

$$\frac{dx}{\sqrt{2}} = \frac{du}{\sqrt{1 - u^4}} \times \frac{1 + u^4}{1 - u^4},$$

and that (33.49) also implied 

$$\sqrt{1 + x^4} = \frac{1 + u^4}{1 - u^4}.$$


By combining these theorems, he obtained the result on the duplication of the 
lemniscatic arc starting at the origin: If 

$$\frac{u\sqrt{2}}{\sqrt{1 - u^4}} = \frac{\sqrt{1 - \sqrt{1 - z^4}}}{z} \tag{33.51}$$

then 

$$\frac{dz}{\sqrt{1 - z^4}} = \frac{2\,du}{\sqrt{1 - u^4}}. \tag{33.52}$$

Note that if P corresponds to z and Q to u, and if z and u are related by 
(33.51), then (33.52) shows that arc OP = 2 arc OQ. This means that if the value 
of z is given, then u can be obtained by taking square roots and conversely. Hence, 
duplication and bisection can be done by straight edge and compass. Fagnano made 
the observation that (33.51) was equivalent to the relation 

$$z = \frac{2u\sqrt{1 - u^4}}{1 + u^4}. \tag{33.53}$$


Recall that Euler saw this result as the extension of the double angle formula for 
arcsine, and it was perhaps this result that led him to the addition formula for the 
lemniscatic integral. 

Fagnano trisected the full arc OA of the lemniscate by combining (33.42) and 
(33.43) with (33.51) and (33.52). To obtain the trisection in a simpler form, he 
presented another transformation: If 

$$\frac{\sqrt{1 - t^4}}{t\sqrt{2}} = \frac{\sqrt{1 - \sqrt{1 - z^4}}}{z} \tag{33.54}$$

then 

$$\frac{dz}{\sqrt{1 - z^4}} = -\frac{2\,dt}{\sqrt{1 - t^4}}.$$

He then noted that he could obtain a point of trisection by setting t = z and that 
the trisection point would be given by $t = \sqrt[4]{2\sqrt{3} - 3}$. One may check that for t = z, 
(33.54) simplifies to the equation $t^8 + 6t^4 - 3 = 0$, and that $2\sqrt{3} - 3$ is a solution of 
$x^2 + 6x - 3 = 0$. 
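
As a numerical illustration of the duplication relation (33.53) and of the trisection value, one can compare arc lengths directly. This is a hedged sketch, not Fagnano's procedure: the function names are hypothetical, the quadrature is a plain Simpson rule, and the test value u = 0.4 is an arbitrary choice small enough that the doubled arc stays inside OA.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def lemniscate_arc(z):
    """Integral of dt/sqrt(1 - t^4) from 0 to z, via t = sin(theta)."""
    return simpson(lambda th: 1 / math.sqrt(1 + math.sin(th) ** 2),
                   0.0, math.asin(z))

def double(u):
    """Duplication relation (33.53): z = 2u sqrt(1 - u^4) / (1 + u^4)."""
    return 2 * u * math.sqrt(1 - u ** 4) / (1 + u ** 4)

quarter = lemniscate_arc(1.0)

# Duplication: arc O->z should be twice arc O->u.
u = 0.4
dup_gap = lemniscate_arc(double(u)) - 2 * lemniscate_arc(u)

# Trisection: t^4 = 2*sqrt(3) - 3 solves t^8 + 6 t^4 - 3 = 0.
t = (2 * math.sqrt(3) - 3) ** 0.25
residual = t ** 8 + 6 * t ** 4 - 3
tri_gap = lemniscate_arc(t) - 2 * quarter / 3

print(dup_gap, residual, tri_gap)
```

In this check the point t lies two thirds of the way along OA; the other trisection point is its image under (33.42).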

Fagnano went on to work out how the arc OA could be divided into five equal parts. 
He did not write down the details, but, based on his trisection method, the method 
would probably begin by taking points on the lemniscate for which the distances from 
the origin are t, z, v, and u such that arc Ot = 2 arc Oz, arc Oz = 2 arc Ov, and arc 
Ov = arc uA. Fagnano’s formulas give the relations connecting t with z, z with v, and 
v with u. Finally, if we take t = u, then we get arc Ot = 4 arc uA and the equation 
for t reduces to 

$$t^{24} + 50t^{20} - 125t^{16} + 300t^{12} - 105t^{8} - 62t^{4} + 5 = 0. \tag{33.55}$$

Although Fagnano did not publish this equation, it is likely that he obtained and 
solved it. Gauss derived the equation, and it is explicitly given in his collected works.¹⁶ 
The twenty-fourth degree polynomial has factors 

$$t^8 - 2t^4 + 5 \quad\text{and}\quad t^8 + (26 \pm 12\sqrt{5})\,t^4 + 9 \pm 4\sqrt{5}, \tag{33.56}$$

where we choose either both plus signs or both minus signs. Note that the first 
polynomial in (33.56) has only complex roots, but the real roots of the other two 
polynomials can be expressed in terms of square roots and are therefore constructible. 
The quinsection may be obtained by solving the polynomial with both negative 
signs. Fagnano stated the corollary that the quadrant of the lemniscate could be 
divided algebraically into a number of equal parts if that number were of the form 
$2 \times 2^m$, $3 \times 2^m$, $5 \times 2^m$ for any positive integer m. He wrote that this was a “new and 
singular property” of his curve. 
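
The factorization of (33.55) into the three factors in (33.56) can be confirmed by exact arithmetic. The sketch below is an illustration under stated assumptions: polynomials are written in the variable y = t⁴, coefficients of the form a + b√5 are stored as integer pairs (a, b), and the helper name `poly_mul` is hypothetical.

```python
import math

def poly_mul(p, q):
    """Multiply polynomials whose coefficients are pairs (a, b) = a + b*sqrt(5)."""
    out = [(0, 0)] * (len(p) + len(q) - 1)
    for i, (a, b) in enumerate(p):
        for j, (c, d) in enumerate(q):
            r, s = out[i + j]
            # (a + b*sqrt5)(c + d*sqrt5) = (ac + 5bd) + (ad + bc)*sqrt5
            out[i + j] = (r + a * c + 5 * b * d, s + a * d + b * c)
    return out

# Polynomials in y = t^4, constant term first.
first = [(5, 0), (-2, 0), (1, 0)]       # y^2 - 2y + 5
plus = [(9, 4), (26, 12), (1, 0)]       # y^2 + (26 + 12*sqrt5) y + 9 + 4*sqrt5
minus = [(9, -4), (26, -12), (1, 0)]    # the factor with both signs negative
product = poly_mul(first, poly_mul(plus, minus))
expected = [5, -62, -105, 300, -125, 50, 1]   # coefficients of (33.55) in y

# Real roots of the "both minus" factor, solved as a quadratic in y = t^4.
s5 = math.sqrt(5)
lin, const = 26 - 12 * s5, 9 - 4 * s5
disc = math.sqrt(lin * lin - 4 * const)
t_values = [((-lin + sign * disc) / 2) ** 0.25 for sign in (1, -1)]

print(product)
print(t_values)
```

The product has vanishing √5 parts and reproduces the coefficients of (33.55); the two values of t are real and lie between 0 and 1, as expected for distances from the origin to points on the arc OA.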


33.3 Euler: Addition Formula 


Although Euler quickly perceived the importance of Fagnano’s work on the lemnis- 
catic integral, he could not at first locate any fundamental guiding principle among 
the large number of apparently ad hoc transformations applied somewhat randomly. 


16 Gauss (1863-1927) vol. 3, pp. 404-405 and vol. 10, p. 162. 
It took him a little while to discover the required unifying ideas: the addition formula 
and the complete integral for the equation¹⁷ 

$$\frac{m\,dx}{\sqrt{1 - x^4}} = \frac{n\,dy}{\sqrt{1 - y^4}}. \tag{33.57}$$


It appears that Euler spotted a hint: Fagnano’s result that 

$$x^2 = \frac{1 - y^2}{1 + y^2} \quad\text{or}\quad x^2y^2 + x^2 + y^2 - 1 = 0$$

actually gave a special integral of the preceding differential equation, when 
m = n = 1. In his first paper on this topic, presented to the Academy in 1753, 
Euler gave some preliminary results on this hint.¹⁸ But soon after this, as his letter to 
Goldbach indicated, he discovered the general algebraic integral and published it in 
his 1753 paper. In the first theorem of this paper, Euler took m = n = 1 and stated 
that the differential equation 


$$\frac{dx}{\sqrt{1 - x^4}} = \frac{dy}{\sqrt{1 - y^4}} \tag{33.58}$$

had the complete integral 

$$xx + yy + ccxxyy = cc + 2xy\sqrt{1 - c^4}. \tag{33.59}$$


Here note that by taking c = 1, one obtains Fagnano’s result (33.44). 
Euler argued that taking the differential of (33.59) gave 


$$x\,dx + y\,dy + ccxy\,(x\,dy + y\,dx) = (x\,dy + y\,dx)\sqrt{1 - c^4},$$

and hence 

$$dx\,\bigl(x + ccxyy - y\sqrt{1 - c^4}\bigr) + dy\,\bigl(y + ccxxy - x\sqrt{1 - c^4}\bigr) = 0. \tag{33.60}$$


He solved (33.59) as a quadratic in y, choosing the signs of the square roots so that 
y =c when x = 0. He similarly solved it as a quadratic in x. Thus he got 


$$y = \frac{x\sqrt{1 - c^4} + c\sqrt{1 - x^4}}{1 + ccxx} \quad\text{and}\quad x = \frac{y\sqrt{1 - c^4} - c\sqrt{1 - y^4}}{1 + ccyy}.$$

These equations implied that 

$$x + ccxyy = y\sqrt{1 - c^4} - c\sqrt{1 - y^4},$$

$$y + ccxxy - x\sqrt{1 - c^4} = c\sqrt{1 - x^4}.$$

17 Eu. I-20 pp. 58-79. E 251. 
18 ibid. pp. 80-107. E 252. 

Euler substituted these relations in (33.60) to obtain 

$$-c\,dx\sqrt{1 - y^4} + c\,dy\sqrt{1 - x^4} = 0.$$


This was equivalent to (33.58), and the theorem was proved. Euler then noted that 
this theorem was equivalent to the formula 


$$\int_0^u \frac{dt}{\sqrt{1 - t^4}} + \int_0^c \frac{dt}{\sqrt{1 - t^4}} = \int_0^v \frac{dt}{\sqrt{1 - t^4}} \tag{33.61}$$

with 

$$v = \frac{u\sqrt{1 - c^4} + c\sqrt{1 - u^4}}{1 + c^2u^2}. \tag{33.62}$$

Note that (33.61) and (33.62) represent the famous addition formula for the 
lemniscatic integral, which generalizes Fagnano’s duplication formula, obtained by 
taking u = c. Thus, Euler saw that the transformation that left the differential 
$\frac{dx}{\sqrt{1 - x^4}}$ invariant also provided the addition formula. 


Euler then considered the more general differential $\frac{dx}{\sqrt{1 + mx^2 + nx^4}}$ and proved in a 
similar way that it remained invariant under the transformation 

$$cc - xx - yy + nccxxyy + 2xy\sqrt{1 + mcc + nc^4} = 0.$$


This in turn yielded an appropriate addition formula for this more general elliptic 
integral. 
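
Euler's addition formula (33.61)-(33.62) and the algebraic integral (33.59) lend themselves to a quick numerical check. The sketch below is illustrative only; `euler_add` and the other helper names are hypothetical, and the sample values u = 0.3 and c = 0.5 are arbitrary choices for which the sum of the two arcs stays within the arc OA.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def lemniscate_arc(z):
    """Integral of dt/sqrt(1 - t^4) from 0 to z, via t = sin(theta)."""
    return simpson(lambda th: 1 / math.sqrt(1 + math.sin(th) ** 2),
                   0.0, math.asin(z))

def euler_add(u, c):
    """Right-hand side of (33.62)."""
    return (u * math.sqrt(1 - c ** 4) + c * math.sqrt(1 - u ** 4)) / (1 + c * c * u * u)

u, c = 0.3, 0.5
v = euler_add(u, c)

# (33.61): the integrals to u and to c should add up to the integral to v.
integral_gap = lemniscate_arc(u) + lemniscate_arc(c) - lemniscate_arc(v)

# (33.59) with x = u, y = v: xx + yy + cc xx yy = cc + 2xy sqrt(1 - c^4).
lhs = u * u + v * v + c * c * u * u * v * v
rhs = c * c + 2 * u * v * math.sqrt(1 - c ** 4)

print(integral_gap, lhs - rhs)
```

Setting u = c in `euler_add` reduces it to Fagnano's duplication relation (33.53).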


33.4 Cayley on Landen’s Transformation 


Landen’s exposition of his transformation is not easy to read; in fact, G. N. Watson 
aptly described it as “clumsy.” However, Cayley’s 1876 text (reprinted in 1895) on 
elliptic functions, described in his preface as “founded upon Legendre’s Traité des 
fonctions elliptiques and upon Jacobi’s Fundamenta Nova, and Memoirs by him in 
Crelle’s Journal,” presents Landen’s work in more felicitous notation and in such a 
manner as to outline its geometric underpinnings and make clear its essential and 
useful features. 

Summarizing Cayley, with reference to Figure 33.2, we begin by taking a point P 
on the circle with center O and another point Q on the diameter AB. Set 

$$QA = a, \quad QB = b, \quad \angle AQP = \phi_1, \quad \angle ABP = \phi,$$

so that $\angle AOP = 2\phi$. Now let 

$$a_1 = \frac{a + b}{2}, \quad b_1 = \sqrt{ab}, \quad c_1 = \frac{a - b}{2}.$$


19 Cayley (1895) pp. 327-328. 
Figure 33.2 Cayley’s diagram for Landen’s transformation. 


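Then 

$$OA = OB = OP = a_1, \quad OQ = \frac{a - b}{2} = c_1,$$

$$QP\sin\phi_1 = a_1\sin 2\phi,$$

$$QP\cos\phi_1 = c_1 + a_1\cos 2\phi,$$

$$QP^2 = c_1^2 + 2c_1a_1\cos 2\phi + a_1^2 = \tfrac{1}{2}(a^2 + b^2)(\cos^2\phi + \sin^2\phi) + \tfrac{1}{2}(a^2 - b^2)(\cos^2\phi - \sin^2\phi) = a^2\cos^2\phi + b^2\sin^2\phi.$$

Therefore, 

$$\sin\phi_1 = \frac{a_1\sin 2\phi}{\sqrt{a^2\cos^2\phi + b^2\sin^2\phi}}, \quad \cos\phi_1 = \frac{c_1 + a_1\cos 2\phi}{\sqrt{a^2\cos^2\phi + b^2\sin^2\phi}}, \tag{33.63}$$

$$a_1^2\cos^2\phi_1 + b_1^2\sin^2\phi_1 = \frac{a_1^2\,(a\cos^2\phi + b\sin^2\phi)^2}{a^2\cos^2\phi + b^2\sin^2\phi}.$$

A simple calculation produces 

$$\sin(2\phi - \phi_1) = \frac{\frac{1}{2}(a - b)\sin 2\phi}{\sqrt{a^2\cos^2\phi + b^2\sin^2\phi}}, \tag{33.64}$$

$$\cos(2\phi - \phi_1) = \frac{1}{a_1}\sqrt{a_1^2\cos^2\phi_1 + b_1^2\sin^2\phi_1}.$$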
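Take a point P′ on the circle close enough to P so that we can regard PP′ as 
tangent to the circle. An elementary geometric argument shows that 

$$PQ\,d\phi_1 = PP'\sin P'PQ = 2a_1\,d\phi\,\cos(2\phi - \phi_1).$$

This is equivalent to 

$$\frac{2\,d\phi}{\sqrt{a^2\cos^2\phi + b^2\sin^2\phi}} = \frac{d\phi_1}{\sqrt{a_1^2\cos^2\phi_1 + b_1^2\sin^2\phi_1}}. \tag{33.65}$$

Thus, if $\phi$ and $\phi_1$ are related by (33.64) and $0 < \phi < \frac{\pi}{2}$, then 

$$\int_0^{\phi} \frac{dt}{\sqrt{a^2\cos^2 t + b^2\sin^2 t}} = \frac{1}{2}\int_0^{\phi_1} \frac{dt}{\sqrt{a_1^2\cos^2 t + b_1^2\sin^2 t}}. \tag{33.66}$$

In particular, 

$$\int_0^{\pi/2} \frac{dt}{\sqrt{a^2\cos^2 t + b^2\sin^2 t}} = \frac{1}{2}\int_0^{\pi} \frac{dt}{\sqrt{a_1^2\cos^2 t + b_1^2\sin^2 t}} = \int_0^{\pi/2} \frac{dt}{\sqrt{a_1^2\cos^2 t + b_1^2\sin^2 t}}. \tag{33.67}$$

Using the notation of Legendre, we set 

$$k^2 = 1 - \frac{b^2}{a^2}, \quad k' = \frac{b}{a}, \quad\text{and}\quad k_1^2 = 1 - \frac{b_1^2}{a_1^2}.$$

Then (33.66) can be written as 

$$\int_0^{\phi} \frac{dt}{\sqrt{1 - k^2\sin^2 t}} = \frac{1}{2}\cdot\frac{a}{a_1}\int_0^{\phi_1} \frac{dt}{\sqrt{1 - k_1^2\sin^2 t}} = \frac{1}{1 + k'}\int_0^{\phi_1} \frac{dt}{\sqrt{1 - k_1^2\sin^2 t}}; \tag{33.68}$$

note that this is in fact Landen’s transformation (33.33). To see this, compare the first 
equations in (33.63) and (33.64) to get 

$$\sin(2\phi - \phi_1) = k_1\sin\phi_1. \tag{33.69}$$

Now (33.33) follows from (33.68) by noting that $\frac{a - b}{a + b} = k_1$. 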
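
Cayley's form of Landen's transformation can be tested numerically by computing $\phi_1$ from (33.63) and comparing the two sides of (33.66). This is a minimal sketch under assumptions: the helper names are hypothetical, a = 2, b = 1, and φ = 0.7 are arbitrary sample values, and `math.atan2` is used to recover $\phi_1$ in (0, π) from its sine and cosine.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def incomplete(phi, a, b):
    """Integral of dt / sqrt(a^2 cos^2 t + b^2 sin^2 t) from 0 to phi."""
    return simpson(
        lambda t: 1 / math.sqrt((a * math.cos(t)) ** 2 + (b * math.sin(t)) ** 2),
        0.0, phi)

a, b = 2.0, 1.0
a1, b1, c1 = (a + b) / 2, math.sqrt(a * b), (a - b) / 2

phi = 0.7
# (33.63): sin(phi1) and cos(phi1) share the same positive denominator,
# so atan2 of the two numerators gives phi1.
phi1 = math.atan2(a1 * math.sin(2 * phi), c1 + a1 * math.cos(2 * phi))

left = incomplete(phi, a, b)             # left side of (33.66)
right = 0.5 * incomplete(phi1, a1, b1)   # right side of (33.66)
k1 = (a - b) / (a + b)

print(left, right)
print(math.sin(2 * phi - phi1), k1 * math.sin(phi1))   # relation (33.69)
```

Taking φ = π/2 in the same functions reproduces the complete-integral identity (33.67).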
Finally, we observe that if we write $y = \sin\phi_1$ and $x = \sin\phi$, then (33.69) can be 
expressed as the quadratic transformation 

$$y = \frac{(1 + k')\,x\sqrt{1 - x^2}}{\sqrt{1 - k^2x^2}} \tag{33.70}$$

and (33.65) takes the form 

$$\frac{(1 + k_1)\,dy}{\sqrt{(1 - y^2)(1 - k_1^2y^2)}} = \frac{2\,dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}}. \tag{33.71}$$


33.5 Lagrange, Gauss, Ivory on the agM 


Lagrange was the first mathematician to observe the connection between the 
arithmetic-geometric mean (agM) and elliptic integrals. His 1784-1785 result (33.39) 
essentially expressed this connection. He made this discovery as he pursued a 
numerical method for evaluating elliptic integrals, and he did not further investigate 
the concept of the agM. In his 1818 paper on astronomy,²⁰ Gauss gave a formula 
relating the agM of two positive real numbers with an elliptic integral and he worked 
out an extensive theory on this topic, but published very little of it. Gauss denoted the 
agM of two positive numbers a and b, with a > b, as M(a,b). He considered the 
sequence 

$$a_1 = \frac{a + b}{2}, \quad b_1 = \sqrt{ab}, \quad a_2 = \frac{a_1 + b_1}{2}, \quad b_2 = \sqrt{a_1b_1}, \quad\text{etc.}, \tag{33.72}$$

and noted that 

$$b < b_1 < b_2 < \cdots < b_n < a_n < \cdots < a_2 < a_1 < a.$$


He observed that if a = b, then $a_n = b_n$ for all n. On the other hand, if a > b, then 

$$\frac{a_n - b_n}{a_{n-1} - b_{n-1}} = \frac{a_{n-1} - b_{n-1}}{4(a_n + b_n)} = \frac{a_{n-1} - b_{n-1}}{2(a_{n-1} + b_{n-1}) + 4b_n}$$

and hence 

$$a_n - b_n < \frac{a_{n-1} - b_{n-1}}{2}. \tag{33.73}$$

Consequently, the increasing sequence $b_n$ and the decreasing sequence $a_n$ 
converged to the same number denoted by M(a,b). 
Gauss also noted the simple properties of M(a,b) given by the equations 

$$M(a,b) = M(a_1,b_1) \quad\text{and}\quad M(na,nb) = nM(a,b) \quad\text{for any real } n > 0. \tag{33.74}$$


20 Gauss (1863-1927) vol. 3, pp. 331-355. 
From these equations he deduced a number of relations; for instance, for $x = \frac{2t}{1 + t^2}$ 
he had 

$$M(1 + x, 1 - x) = M\left(1 + \frac{2t}{1 + t^2},\, 1 - \frac{2t}{1 + t^2}\right) = \frac{1}{1 + t^2}\,M(1 + t^2, 1 - t^2). \tag{33.75}$$

Gauss’s theorem, published in 1818 but proved much earlier, stated that 

$$\frac{1}{M(a,b)} = \frac{2}{\pi}\int_0^{\pi/2} \frac{d\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}}. \tag{33.76}$$


To see how Lagrange’s transformation implies this theorem of Gauss, denote the 
integral in (33.76) by I(a,b) and set $x = \frac{\cot\theta}{b}$ to get 

$$I(a,b) = \frac{2}{\pi}\int_0^{\infty} \frac{dx}{\sqrt{(1 + a^2x^2)(1 + b^2x^2)}}. \tag{33.77}$$

Recall Lagrange’s result (33.39), that if 

$$x = y\,\sqrt{\frac{1 + a_1^2y^2}{1 + b_1^2y^2}} \tag{33.78}$$

then 

$$\frac{dx}{\sqrt{(1 + a^2x^2)(1 + b^2x^2)}} = \frac{dy}{\sqrt{(1 + a_1^2y^2)(1 + b_1^2y^2)}}. \tag{33.79}$$

When this result is applied to (33.77), we see that 

$$I(a,b) = I(a_1,b_1). \tag{33.80}$$

Upon iteration, we conclude that if c = M(a,b), then 

$$I(a,b) = I(a_n,b_n) = I(c,c) = \frac{2}{\pi}\int_0^{\infty} \frac{dx}{1 + c^2x^2} = \frac{1}{c}.$$


This proves Gauss’s theorem. Note that (33.67) is identical to (33.80). 
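
Gauss's theorem (33.76) and the invariance (33.80) are easy to test numerically. The following is a minimal sketch, not Gauss's or Lagrange's own computation: `agm` and `gauss_integral` are hypothetical helper names, a = 3 and b = 1 are arbitrary sample values, and the fixed iteration count in `agm` is an assumption that comfortably exceeds what double precision needs.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def agm(a, b, steps=60):
    """Arithmetic-geometric mean M(a, b) by iterating (33.72)."""
    for _ in range(steps):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a

def gauss_integral(a, b):
    """I(a, b) = (2/pi) * integral over [0, pi/2] in (33.76)."""
    f = lambda th: 1 / math.sqrt((a * math.cos(th)) ** 2 + (b * math.sin(th)) ** 2)
    return 2 / math.pi * simpson(f, 0.0, math.pi / 2)

a, b = 3.0, 1.0
a1, b1 = (a + b) / 2, math.sqrt(a * b)

print(1 / agm(a, b), gauss_integral(a, b))            # (33.76)
print(gauss_integral(a, b), gauss_integral(a1, b1))   # (33.80)
```

Both printed pairs should agree closely; the homogeneity property in (33.74) can be checked in the same way by comparing `agm(2 * a, 2 * b)` with `2 * agm(a, b)`.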
Gauss derived (33.76) by means of a different transformation. He set 

$$\sin\theta = \frac{2a\sin\theta'}{(a + b)\cos^2\theta' + 2a\sin^2\theta'} = \frac{2a\sin\theta'}{a + b + (a - b)\sin^2\theta'}$$


and observed that (33.76) would follow. Jacobi provided more details on this trans- 
formation in section 38 of his Fundamenta Nova, published a decade after Gauss’s 
paper. Since Jacobi was pursuing other threads, his presentation is not as direct as 
that in Cayley’s 1876 treatise. Following Cayley, replace $\sin\theta$ and $\sin\theta'$ by y and x, 
respectively, to write Gauss’s substitution as 

$$y = \frac{(1 + k)x}{1 + kx^2}, \quad k = \frac{a - b}{a + b}. \tag{33.81}$$

This is the form in which Gauss’s transformation is often presented, particularly 
in connection with the transformation theory of elliptic functions. One then perceives 
that proving the relation $I(a,b) = I(a_1,b_1)$ is equivalent to showing that 

$$\frac{dy}{\sqrt{(1 - y^2)(1 - \lambda^2y^2)}} = \frac{(1 + k)\,dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}}, \tag{33.82}$$

where $\lambda = \frac{2\sqrt{k}}{1 + k}$. 
To prove (33.82), let $D = 1 + kx^2$, the denominator of y in (33.81). Then 

$$1 - y = \frac{(1 - x)(1 - kx)}{D}, \quad 1 + y = \frac{(1 + x)(1 + kx)}{D},$$

$$1 - \lambda y = \frac{(1 - \sqrt{k}\,x)^2}{D}, \quad 1 + \lambda y = \frac{(1 + \sqrt{k}\,x)^2}{D}.$$

Consequently, 

$$\sqrt{(1 - y^2)(1 - \lambda^2y^2)} = \frac{(1 - kx^2)\sqrt{(1 - x^2)(1 - k^2x^2)}}{D^2}, \tag{33.83}$$

$$dy = \frac{(1 + k)(1 - kx^2)\,dx}{D^2}. \tag{33.84}$$
Note that formulas (33.83) and (33.84) imply (33.82), giving us another proof of 
Gauss’s theorem. 

Gauss gave yet another proof, by means of power series, though he did not publish 
it.²¹ We reproduce Gauss’s proof, but we use subscript and factorial symbols where 
Gauss did not. In this derivation, he assumed that M(1+x, 1−x) had a series expansion 
so that he could write 

$$\frac{1}{M(1 + x, 1 - x)} = A_0 + A_1x^2 + A_2x^4 + A_3x^6 + \cdots.$$


21 Gauss (1863-1927) vol. 3, pp. 361-403, especially pp. 367-369. 
Using (33.75), he had 

$$\frac{2t}{1 + t^2} + A_1\left(\frac{2t}{1 + t^2}\right)^3 + A_2\left(\frac{2t}{1 + t^2}\right)^5 + \cdots = 2t\,(A_0 + A_1t^4 + A_2t^8 + \cdots).$$

He equated the coefficients of powers of t to obtain the relations 

$$A_0 = 1,$$
$$0 = 1 - 4A_1,$$
$$A_1 = 1 - 12A_1 + 16A_2,$$
$$0 = 1 - 24A_1 + 80A_2 - 64A_3,$$
$$A_2 = 1 - 40A_1 + 240A_2 - 448A_3 + 256A_4, \quad\text{etc.}$$


Unlike earlier mathematicians, he also presented the general nth relation 

$$M = 1 - \frac{4\,n(n - 1)}{2!}A_1 + \frac{16\,(n + 1)n(n - 1)(n - 2)}{4!}A_2 - \frac{64\,(n + 2)(n + 1)n(n - 1)(n - 2)(n - 3)}{6!}A_3 + \cdots,$$

with the remark that M = 0 when n was even and $M = A_{(n-1)/2}$ when n was odd; in 
other words, M was the nth term of the series $A_0, 0, A_1, 0, A_2, \ldots$. Taking 0 = 0 as the 
0th equation, and abbreviating and labeling the equations as [0], [1], [2], ..., he wrote 
down the equations 

$$1^2[2] - 0^2[0], \quad 2^2[3] - 1^2[1], \quad 3^2[4] - 2^2[2], \quad 4^2[5] - 3^2[3], \quad 5^2[6] - 4^2[4], \ldots.$$


In general, for the nth equation he had 

$$n^2N - (n - 1)^2L = (2n - 1)\Bigl(1 - \frac{4\,(3n^2 - 3n + 2)}{2!}A_1 + \frac{16\,n(n - 1)(5n^2 - 5n + 6)}{4!}A_2 - \frac{64\,(n + 1)n(n - 1)(n - 2)(7n^2 - 7n + 12)}{6!}A_3 + \cdots\Bigr). \tag{33.85}$$

An editorial note to Gauss’s paper observes that L and N were equal to $A_{(n-2)/2}$ and 
$A_{n/2}$, respectively, when n was even, and zero when n was odd. In another footnote, 
it is pointed out that the derivation connected with the forms $n^2N - (n - 1)^2L$ was 
explained in article 162 of the Disquisitiones Arithmeticae. It may be of interest to note 
that in that article, Gauss discussed the problem of determining all transformations of 
the form 

$$X = \alpha'x + \beta'y, \quad Y = \gamma'x + \delta'y,$$

given one known transformation 

$$X = \alpha x + \beta y, \quad Y = \gamma x + \delta y$$

of 

$$AX^2 + 2BXY + CY^2 \quad\text{to}\quad ax^2 + 2bxy + cy^2.$$


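Continuing the proof, let k = 2l − 1. Gauss wrote the lth term of the expression in 
(33.85) (without the factor 2n − 1) as 

$$\pm 2^{k-1}A_{\frac{k-1}{2}}\,\frac{\left(n + \frac{k-5}{2}\right)\left(n + \frac{k-7}{2}\right)\cdots\left(n - \frac{k-3}{2}\right)\left(kn^2 - kn + \frac{k^2-1}{4}\right)}{(k - 1)!}. \tag{33.86}$$

We note that the sign is plus when l − 1 is even and minus when l − 1 is odd. Gauss 
then divided each such term into two parts and added the second part of one term to 
the first part of the succeeding term. As a first step in this process, he observed that 

$$kn^2 - kn + \frac{k^2 - 1}{4} = k\left(n + \frac{k-3}{2}\right)\left(n - \frac{k-1}{2}\right) + \frac{(k - 1)^3}{4}.$$

Using this, he could express the term (33.86) as a sum of two terms 

$$\pm A_{\frac{k-1}{2}}\left[2^{k-1}k^2\,\frac{\left(n + \frac{k-3}{2}\right)\left(n + \frac{k-5}{2}\right)\cdots\left(n - \frac{k-1}{2}\right)}{k!} + 2^{k-3}(k - 1)^2\,\frac{\left(n + \frac{k-5}{2}\right)\cdots\left(n - \frac{k-3}{2}\right)}{(k - 2)!}\right].$$

When he added the second part of this expression to the first part of the succeeding 
expression, he obtained 

$$\pm\frac{2^{k-1}}{k!}\left(n + \frac{k-3}{2}\right)\left(n + \frac{k-5}{2}\right)\cdots\left(n - \frac{k-1}{2}\right)\left(k^2A_{\frac{k-1}{2}} - (k + 1)^2A_{\frac{k+1}{2}}\right).$$

Thus, he could express (33.85) as 

$$n^2N - (n - 1)^2L = (2n - 1)\Bigl((A_0 - 2^2A_1) - \frac{4\,n(n - 1)}{2!\,3}(3^2A_1 - 4^2A_2) + \frac{16\,(n + 1)n(n - 1)(n - 2)}{4!\,5}(5^2A_2 - 6^2A_3) - \frac{64\,(n + 2)(n + 1)n(n - 1)(n - 2)(n - 3)}{6!\,7}(7^2A_3 - 8^2A_4) + \cdots\Bigr).$$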
To clarify the general result implied by this procedure, Gauss wrote the first few 
special cases: 

$$0 = 1 - 4A_1,$$
$$4A_1 - 1 = 3(1 - 4A_1) - 4(9A_1 - 16A_2),$$
$$0 = 5(1 - 4A_1) - 20(9A_1 - 16A_2) + 16(25A_2 - 36A_3),$$
$$16A_2 - 9A_1 = 7(1 - 4A_1) - 56(9A_1 - 16A_2) + 112(25A_2 - 36A_3) - 64(49A_3 - 64A_4), \quad\text{etc.}$$

Thus, he found that 

$$A_1 = \frac{1^2}{2^2}, \quad A_2 = \frac{1^2 \cdot 3^2}{2^2 \cdot 4^2}, \quad A_3 = \frac{1^2 \cdot 3^2 \cdot 5^2}{2^2 \cdot 4^2 \cdot 6^2},$$

and in general 

$$A_n = \frac{1^2 \cdot 3^2 \cdots (2n - 1)^2}{2^2 \cdot 4^2 \cdots (2n)^2}.$$


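Gauss then related M(1+x, 1−x) with the elliptic integral by evaluating the following 
integral as a series: 

$$\frac{2}{\pi}\int_0^{\pi/2} (1 - x^2\cos^2\theta)^{-1/2}\,d\theta = \frac{2}{\pi}\int_0^{\pi/2}\left(1 + \frac{1}{2}x^2\cos^2\theta + \frac{1 \cdot 3}{2 \cdot 4}x^4\cos^4\theta + \cdots\right)d\theta = 1 + \frac{1^2}{2^2}x^2 + \frac{1^2 \cdot 3^2}{2^2 \cdot 4^2}x^4 + \cdots. \tag{33.87}$$

Since the series for the integral and the agM were the same, he concluded that 

$$\frac{1}{M(1 + x, 1 - x)} = \frac{2}{\pi}\int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - x^2\sin^2\theta}}.$$

Finally, one may see that by taking $x = \sqrt{1 - \frac{b^2}{a^2}}$ we have Gauss’s formula (33.76). 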
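
Gauss's coefficient relations and the closed form for $A_n$ can be verified with exact rational arithmetic. The sketch below is an illustration under assumptions: the nth relation is written here in the binomial form $\sum_j (-4)^j \binom{n+j-1}{2j} A_j$, which reproduces the listed relations for n = 1, ..., 5 but is a restatement chosen for this example, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt

def A(n):
    """A_n = (1*3*...*(2n-1))^2 / (2*4*...*(2n))^2 as an exact fraction."""
    value = Fraction(1)
    for j in range(1, n + 1):
        value *= Fraction(2 * j - 1, 2 * j) ** 2
    return value

def relation(n):
    """Right-hand side of the nth relation, in binomial form (an assumption)."""
    return sum((-4) ** j * comb(n + j - 1, 2 * j) * A(j) for j in range(n))

def expected_M(n):
    """M = A_{(n-1)/2} when n is odd and 0 when n is even."""
    return A((n - 1) // 2) if n % 2 else Fraction(0)

def agm(a, b, steps=60):
    """Arithmetic-geometric mean by repeated averaging."""
    for _ in range(steps):
        a, b = (a + b) / 2, sqrt(a * b)
    return a

# Exact check of the relations [1], [2], ..., [11].
relations_hold = all(relation(n) == expected_M(n) for n in range(1, 12))

# Floating-point check that the series matches 1/M(1+x, 1-x) at a sample x.
x = 0.3
series = sum(float(A(n)) * x ** (2 * n) for n in range(40))

print(relations_hold)
print(series, 1 / agm(1 + x, 1 - x))
```

The exact check confirms that the product formula for $A_n$ satisfies every relation tested, and the floating-point comparison illustrates the identification of the series with the reciprocal of the agM.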
In a 1796 paper,²² James Ivory gave an interesting new method to prove the formula 

$$K\left(\frac{2\sqrt{x}}{1 + x}\right) = (1 + x)K(x), \tag{33.88}$$

where 

$$K(x) = \frac{2}{\pi}\int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - x^2\cos^2\theta}}.$$


Legendre was the first to prove this result, for the purpose of numerically evaluating 
complete elliptic integrals; his proof used the Landen transformation. In his paper, 


22 Ivory (1796). 
Ivory did not mention the agM and, indeed, he may not have been aware of its 
significance even if he had noticed it in Lagrange’s 1785 paper. In the cover letter 
accompanying his paper, Ivory explained that his aim was to present a simple method 
for obtaining the expansion 

$$(a^2 + b^2 - 2ab\cos\phi)^n = A + B\cos\phi + C\cos 2\phi + \cdots. \tag{33.89}$$


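Ivory started with the relation $\sin(\psi - \phi) = c\sin\psi$, took its fluxion (derivative), 
and simplified to get 

$$\dot{\phi} = \frac{\sqrt{1 - c^2\sin^2\psi} - c\cos\psi}{\sqrt{1 - c^2\sin^2\psi}}\;\dot{\psi}.$$

He performed an elementary calculation to show that the numerator could be 
expressed as $\sqrt{1 + c^2 - 2c\cos\phi}$. This led him to the equation 

$$\frac{\dot{\phi}}{\sqrt{1 + c^2 - 2c\cos\phi}} = \frac{\dot{\psi}}{\sqrt{1 - c^2\sin^2\psi}}. \tag{33.90}$$

He then set 

$$c' = \frac{1 - \sqrt{1 - c^2}}{1 + \sqrt{1 - c^2}} \tag{33.91}$$

to find that 

$$\sqrt{1 - c^2\sin^2\psi} = \frac{\sqrt{1 + c'^2 + 2c'\cos 2\psi}}{1 + c'}.$$

Thus, he could express (33.90) in the form 

$$\frac{\dot{\phi}}{\sqrt{1 + c^2 - 2c\cos\phi}} = \frac{(1 + c')\,\dot{\psi}}{\sqrt{1 + c'^2 + 2c'\cos 2\psi}}. \tag{33.92}$$

Ivory’s contribution was a new method for evaluating the integrals 

$$\int_0^{\pi} \frac{d\phi}{\sqrt{1 + c^2 - 2c\cos\phi}} = \int_0^{\pi} \frac{(1 + c')\,d\psi}{\sqrt{1 + c'^2 + 2c'\cos 2\psi}}. \tag{33.93}$$

First, he observed that by the binomial theorem 

$$(1 + c^2 - 2c\cos\phi)^{-1/2} = (1 - ce^{i\phi})^{-1/2}(1 - ce^{-i\phi})^{-1/2} = \left(1 + \frac{1}{2}ce^{i\phi} + \frac{1 \cdot 3}{2 \cdot 4}c^2e^{2i\phi} + \cdots\right)\left(1 + \frac{1}{2}ce^{-i\phi} + \frac{1 \cdot 3}{2 \cdot 4}c^2e^{-2i\phi} + \cdots\right). \tag{33.94}$$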
Ivory then noted that multiplying these two series gave him the cosine expansion 
when $n = -\frac{1}{2}$ in (33.89). This showed that the value of the integral on the left-hand 
side of (33.93) was simply the constant A. Since the constant term A was easy to 
evaluate from the product of the two series, he could express equation (33.93) as 

$$\frac{1}{\pi}\int_0^{\pi} \frac{d\phi}{\sqrt{1 + c^2 - 2c\cos\phi}} = 1 + \frac{1^2}{2^2}c^2 + \frac{1^2 \cdot 3^2}{2^2 \cdot 4^2}c^4 + \cdots = (1 + c')\left(1 + \frac{1^2}{2^2}c'^2 + \frac{1^2 \cdot 3^2}{2^2 \cdot 4^2}c'^4 + \cdots\right). \tag{33.95}$$

Observe that this proves (33.88). 

Ivory remarked that c′ was smaller than c; as an example, he pointed out that 
when $c = \frac{4}{5}$, then $c' = \frac{1}{4}$. Thus, (33.95) was very useful for computational purposes. 
Ivory noted, as Legendre had done before him, that the formula’s computational 
effectiveness was greatly improved by iteration. Now note that (33.95) can be 


expressed as a quadratic transformation of hypergeometric functions: 


$$F\left(\tfrac{1}{2}, \tfrac{1}{2}, 1, c^2\right) = (1 + c')\,F\left(\tfrac{1}{2}, \tfrac{1}{2}, 1, c'^2\right),$$


where c’ is given by (33.91). Gauss may have been motivated to study quadratic 
transformations of hypergeometric functions because of this and similar results. 
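
The computational point Ivory and Legendre made about iteration can be illustrated with a few lines of Python. This is a hedged sketch rather than Ivory's own calculation: `K` sums the series on the left of (33.95) directly, `descend` is a hypothetical name for the map (33.91), and c = 0.8 is an arbitrary sample value.

```python
import math

def K(x, terms=400):
    """Series 1 + (1/2)^2 x^2 + (1*3/(2*4))^2 x^4 + ... for |x| < 1."""
    total, coeff = 0.0, 1.0
    for n in range(terms):
        total += coeff * coeff * x ** (2 * n)
        coeff *= (2 * n + 1) / (2 * n + 2)   # next binomial coefficient of (1-z)^(-1/2)
    return total

def descend(c):
    """The map c -> c' of (33.91)."""
    s = math.sqrt(1 - c * c)
    return (1 - s) / (1 + s)

c = 0.8
c_prime = descend(c)
direct = K(c)                          # slowly convergent series in c
one_step = (1 + c_prime) * K(c_prime)  # right side of (33.95), series in c'

# Iterating the step drives c toward 0, where K = 1, so K(c) becomes a
# rapidly convergent product of the factors (1 + c'), (1 + c''), ...
product, ck = 1.0, c
for _ in range(8):
    ck = descend(ck)
    product *= 1 + ck

print(c_prime)
print(direct, one_step, product)
```

The three values agree to machine precision, and only four or five factors of the product differ noticeably from 1, which is the improvement in effectiveness that the iteration provides.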

We have discussed two different quadratic transformations of elliptic integrals. We 
can rewrite Landen’s transformation (33.33) in the form: If 

$$\lambda = \frac{2\sqrt{k}}{1 + k}, \quad \lambda^2 + \lambda'^2 = 1, \quad z = (1 + \lambda')\,y\,\sqrt{\frac{1 - y^2}{1 - \lambda^2y^2}},$$

then 

$$\frac{(1 + k)\,dz}{\sqrt{(1 - z^2)(1 - k^2z^2)}} = \frac{2\,dy}{\sqrt{(1 - y^2)(1 - \lambda^2y^2)}}. \tag{33.96}$$

Note: (33.96) gives one quadratic transformation and the other is Gauss’s 
transformation given by the equations (33.81) and (33.82). It is easy to check that if these 
transformations are applied one after the other, we get the duplication of the elliptic 
integral: 

$$\frac{dz}{\sqrt{(1 - z^2)(1 - k^2z^2)}} = \frac{2\,dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}}, \quad z = \frac{2x\sqrt{1 - x^2}\sqrt{1 - k^2x^2}}{1 - k^2x^4}.$$
Gauss wrote in an 1816 letter to his friend Schumacher that he rediscovered the 
arithmetic-geometric mean in 1791 at the age of 14.²³ From 1791 until 1800, Gauss 
made a series of discoveries advancing the theory of elliptic integrals and functions 
to new and extraordinary heights. Among his great achievements in this area were the 
inversion of the elliptic integral and the consequent discovery of double periodicity 
and the development of elliptic functions as power series and as double products, 
leading to series and product expansions in terms of trigonometric functions. This in 
turn brought him to the discovery of the theta functions and the triple product identity. 
The initial motivation behind Gauss’s work on elliptic functions was the problem 
of the division of the lemniscate. Gauss solved the problem by means of complex 
multiplication of elliptic functions. Finally, in 1800, he extended his youthful work 
on the arithmetic-geometric mean by considering the agM of two complex numbers. 
He found that the agM in this case was countably many-valued, and his attempts to 
find a relation among the values led him deeper into the theory of theta and modular 
functions. In this connection, Gauss discovered an important transformation of theta 
functions: 


$$\sum_{k=-\infty}^{\infty} e^{-\alpha(k + \omega)^2} = \sqrt{\frac{\pi}{\alpha}}\,\sum_{k=-\infty}^{\infty} e^{-\frac{\pi^2k^2}{\alpha} + 2k\omega\pi i}.$$


It is remarkable that Gauss published very little of these groundbreaking theories, 
although they surely rank among his greatest discoveries in pure mathematics. Perhaps 
he wished to first develop a coherent theory of functions of complex variables. 
Consider the fact that, though he initially discovered double periodicity through a 
formal use of complex numbers, his 1800 work defined an elliptic function by means 
of a ratio of two theta functions. His early definition of an elliptic function by means 
of the inversion of an elliptic integral would require the concept and careful use 
of analytic continuation, not then developed. We note that Gauss’s 1811 letter to 
Bessel shows that he was making inroads into the mysteries of complex variables. 
The unpublished portion of Gauss’s 1813 paper on hypergeometric series also gives 
some indication of his understanding of analytic continuation. However, it seems that 
after 1805-06, Gauss never found the time to completely develop his ideas in number 
theory, elliptic functions, or complex variables. From 1801 onward, he researched 
applied topics such as astronomy, geodesy, telegraphy, magnetism, crystallography, 
and optics. Of course, mathematical problems in these areas led him to interesting and 
important discoveries such as the method of least squares, trigonometric interpolation, 
numerical integration, the technique of fast Fourier transforms, and the theory of 
curved surfaces. 


23 Peters (1860-1865) vol. 1, p. 125. 
Thus, Gauss never wrote up a detailed account of his researches on elliptic 
functions, though he wrote a substantial amount on theta functions. In his fragmentary 
notes on elliptic functions, one may see some of the main results but usually there are 
no details of the methods he employed. However, in a letter to Schumacher,²⁴ Gauss 
wrote that Abel’s first paper on the theory of elliptic functions followed the same 
path he himself had trod in 1798 and that Abel’s work relieved him of the burden of 
publishing that part of his work. He wrote a similar letter to Crelle and the entries in 
Gauss’s diary from the period 1797-1800 bear out these assertions. 

Gauss’s investigations relating to elliptic functions, the agM, and theta functions 
took place during 1791-1800, the critical period when he was maturing into a most 
formidable mathematical mind. Gauss’s early interest in mathematics was kindled by 
his association with Johann Bartels (1769-1836), a teacher’s assistant at the school 
Gauss attended. Gauss was 11 years old when he and Bartels studied infinite series and 
the binomial theorem. Bartels later became professor of mathematics at the University 
of Kazan where he taught the great Russian mathematician N. I. Lobachevsky 
(1793-1856), one of the discoverers of non-Euclidean geometry. Having become known as a 
promising student, in 1791 Gauss met the Duke of Braunschweig and was presented 
with a table of logarithms by the minister of state. Greatly impressed with the genius 
of Gauss, the Duke provided financial support for Gauss to attend the Collegium 
Carolinum in Braunschweig (Brunswick). Entering the Collegium in 1792, Gauss 
became accomplished in languages and began studying the works of Newton, Euler, 
and Lagrange. He was impressed by Euler’s pentagonal number theorem: 


$$\prod_{n=1}^{\infty} (1 - x^n) = \sum_{n=-\infty}^{\infty} (-1)^n x^{\frac{3n^2 + n}{2}}.$$


This result led him to investigate series whose exponents were square or triangular 
numbers. Gauss is reported to have said that in 1794 he knew the connection between 
such series and the agM. This is very likely because the series identities needed for 
this question could be easily proved by methods Gauss had seen in the works of Euler. 
To state the identities, set $A(x) = \sum x^{n^2}$ and $B(x) = \sum (-1)^n x^{n^2}$, where the sums 
are over all integers. Then 

$$A(x) + B(x) = 2A(x^4), \tag{33.97}$$
$$A^2(x) + B^2(x) = 2A^2(x^2), \tag{33.98}$$
$$A(x)B(x) = B^2(x^2). \tag{33.99}$$


The first identity, (33.97), is almost obvious; the second, (33.98), can be proved 
by first observing that the coefficient of $x^n$ in $A^2(x)$ is the number of ways n can be 
expressed as a sum of two integer squares and then noting that this is the same as 
the number of ways 2n can be expressed as a sum of two integer squares. The third 
identity, (33.99), is a consequence of the first two because 


$$A(x)B(x) = \frac{(A(x) + B(x))^2}{2} - \frac{A^2(x) + B^2(x)}{2} = 2A^2(x^4) - A^2(x^2) = B^2(x^2).$$

24 Gauss (1863-1927) vol. 10₁, p. 248. 

It follows from (33.98) and (33.99) that the arithmetic mean and geometric mean of 
$A^2(x)$ and $B^2(x)$ are $A^2(x^2)$ and $B^2(x^2)$. This kind of reasoning must have been very 
familiar to Gauss in 1794, both from his investigations in collaboration with Bartels 
and from his study of Euler’s Introductio. Gauss wrote up his results connecting series 
with the agM at a later date.²⁵ In that manuscript, he derived the properties of the series 
by means of their product representations obtained from the triple product identity. 
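
The three identities (33.97)-(33.99), and their link to the agM, can be checked numerically with truncated series. This is an illustrative sketch only: the truncation length, the sample value x = 0.3, and the helper names are assumptions of the example. Because each agM step sends the pair $A^2(x), B^2(x)$ to $A^2(x^2), B^2(x^2)$, repeated steps drive both toward $A^2(0) = B^2(0) = 1$, so the agM of the pair should equal 1.

```python
import math

def A(x, terms=40):
    """A(x) = sum over all integers n of x^(n^2), truncated."""
    return 1 + 2 * sum(x ** (n * n) for n in range(1, terms))

def B(x, terms=40):
    """B(x) = sum over all integers n of (-1)^n x^(n^2), truncated."""
    return 1 + 2 * sum((-1) ** n * x ** (n * n) for n in range(1, terms))

def agm(a, b, steps=60):
    """Arithmetic-geometric mean by repeated averaging."""
    for _ in range(steps):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a

x = 0.3
checks = {
    "33.97": A(x) + B(x) - 2 * A(x ** 4),
    "33.98": A(x) ** 2 + B(x) ** 2 - 2 * A(x * x) ** 2,
    "33.99": A(x) * B(x) - B(x * x) ** 2,
}
limit = agm(A(x) ** 2, B(x) ** 2)

print(checks)
print(limit)
```

Each entry of `checks` is the difference between the two sides of the corresponding identity and should vanish up to rounding.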

In 1794-1795, Gauss does not seem to have been aware of the connection of 
these series or the agM with elliptic integrals. Then in October 1795, with the 
continued support of the Duke, Gauss registered as a student at the University of 
Göttingen where he had access to an excellent library. For example, in early 1796 he 
borrowed many volumes of the Mémoires de l’Académie de Berlin from the library. 
The volumes contained several works of Lagrange on number theory, algebra, and 
other mathematical topics. Gauss first mentioned elliptic integrals in his mathematical 
diary on September 9, 1796.²⁶ He gave the power series expansion of the inverse 
of the elliptic integral $\int (1 - x^3)^{-1/2}\,dx$; he found it by Newton’s reversion of series 
method. A few days later, he noted the series for the inverse of the more general 
integral $\int (1 - x^n)^{-1/2}\,dx$. In January 1797, his interest in elliptic integrals became 
more serious. His notes indicate that he had already studied Stirling and Euler on this 
topic. He noted in his January 7 entry in the diary that 


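$$\int \sqrt{\sin x}\;dx = 2\int \frac{yy\,dy}{\sqrt{1 - y^4}}, \quad \int \sqrt{\tan x}\;dx = 2\int \frac{dy}{\sqrt[4]{1 - y^4}}, \quad yy = \begin{smallmatrix}\sin\\ \cos\end{smallmatrix}\,x,$$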

and a day later he recorded that he had started investigating the elastic curve depending 
on $\int (1 - x^4)^{-1/2}\,dx$. Later he crossed out the words “elastic curve” and replaced them 
with “lemniscate.” In order to understand the reason for this change in point of view, 
we note that Gauss started his mathematical diary in March 1796 when he discovered 
the principles underlying the problem of dividing the circle into n equal parts. In 
particular, this problem required a study of the polynomials obtained when sin(nx) 
and cos(nx) were expressed in terms of sin x and cos x. Note that the sine or cosine 

1797, Gauss found that in order to divide the lemniscate into n equal parts, he had to 
study the properties of the lemniscatic function, defined as the inverse of the integral 


$$\int \frac{dx}{\sqrt{1 - x^4}}.$$

25 ibid. vol. 3, pp. 466-469. 
26 See Dunnington (2004) pp. 469-484. 

On March 19, Gauss observed in his diary that the division of the lemniscate into n 
parts led to an algebraic equation of degree $n^2$. In fact, this follows from the addition 
formula for the lemniscatic integral. However, it appears from his September 1796 
note that he was already thinking in terms of the inverse of the integral $\int \frac{dx}{\sqrt{1 - x^4}}$. If 
we denote the inverse by sl x, then we see that he had discovered that sl(nx) could 
be expressed as a rational function of sl x and that the numerator was a polynomial 
of degree $n^2$ in sl x. Since only n solutions correspond to real division points, this 
discovery showed him that a majority of the solutions of the equation of degree $n^2$ had 
to be complex. It is possible that this led him to make an imaginary substitution in the 
integral $\int \frac{dx}{\sqrt{1 - x^4}}$ and this in turn led him to the discovery of the double periodicity of 
the lemniscatic function. In his undated diary entry between March 19 and March 21, 
Gauss noted that the lemniscate was geometrically divisible into five equal parts. This 
is a remarkable statement; it shows that Gauss had not only found double periodicity 
but had also found an example of complex multiplication of elliptic functions. We note 
that in general, an elliptic function $\phi(x)$ has two fundamental periods whose ratio must 
be a complex number. Moreover, for any integer n, the addition formula for elliptic 
functions shows that $\phi(nx)$ is a rational function of $\phi(x)$. However, in the case where 
the ratio of the fundamental periods is a root of a quadratic with rational coefficients, 
there exists a complex number $\alpha$ such that $\phi(\alpha x)$ can be expressed as a rational 
function of $\phi(x)$. In this situation, we say that $\phi(x)$ permits complex multiplication by 
$\alpha$. Apparently, in 1828 Abel was the first to study this phenomenon. It is not clear to 
what extent Gauss had investigated complex multiplication, but he certainly used it in 
connection with dividing the lemniscate into n equal parts, at least when n = 5. To be 
able to show that a fifth part of the lemniscatic curve could be obtained by geometric 
construction, Gauss had to solve two appropriate quadratic equations. The surviving 
fragments of Gauss’s work on this problem do not contain these equations, but they 
can be found in Abel’s first paper on elliptic functions.”’ 
Abel noted that 5 = (2 + i)(2 − i), and that for y = φ(x) = sl(x),

$$\varphi((2+i)x) = yi\,\frac{1-2i-y^4}{1-(1-2i)y^4} = z, \tag{33.100}$$

$$\varphi(5x) = \varphi((2-i)(2+i)x) = -iz\,\frac{1+2i-z^4}{1-(1+2i)z^4}. \tag{33.101}$$

We note that Abel here used complex multiplication of φ by 2 ± i. Next, using (33.101)
to solve the equation φ(5x) = 0, he had to first solve $z^4 = 1 + 2i$, and then solve

$$yi\,\frac{1-2i-y^4}{1-(1-2i)y^4} = (1+2i)^{1/4}. \tag{33.102}$$

Solve the latter by dividing the previous equation by the conjugate equation

$$-yi\,\frac{1+2i-y^4}{1-(1+2i)y^4} = (1-2i)^{1/4} \tag{33.103}$$

27 Abel (2007) p. 248. 
to obtain a quadratic in $y^4$. Note that all the equations can be solved by appropriate
quadratic equations.
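The complex multiplication formulas can be checked numerically. The following Python sketch is ours, not Abel's or Gauss's: it reads (33.100) and (33.101) as φ((2+i)x) = iy(1 − 2i − y⁴)/(1 − (1 − 2i)y⁴) and φ(5x) = −iz(1 + 2i − z⁴)/(1 − (1 + 2i)z⁴), evaluates sl through the Fourier series (33.117), and assumes the relation π/ω = agM(1, √2). The helper names `agm`, `sl`, `mult_2_plus_i`, and `mult_2_minus_i` are hypothetical.

```python
import cmath
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive reals (fixed iteration count)."""
    for _ in range(40):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


# Assumption: omega is obtained from pi/omega = agM(1, sqrt(2)).
OMEGA = math.pi / agm(1.0, math.sqrt(2.0))


def sl(z, terms=40):
    """Lemniscatic sine from its Fourier series; converges for |Im z| < omega/2."""
    t = math.pi * z / OMEGA
    total = 0j
    for n in range(terms):
        k = 2 * n + 1
        # 4/(e^{k pi/2} + e^{-k pi/2}) equals 2/cosh(k pi/2)
        total += (-1) ** n * cmath.sin(k * t) / math.cosh(k * math.pi / 2)
    return 2 * math.pi / OMEGA * total


def mult_2_plus_i(y):
    """Rational expression for sl((2+i)x) in terms of y = sl(x)."""
    return 1j * y * (1 - 2j - y ** 4) / (1 - (1 - 2j) * y ** 4)


def mult_2_minus_i(z):
    """Conjugate expression, giving sl((2-i)u) in terms of z = sl(u)."""
    return -1j * z * (1 + 2j - z ** 4) / (1 - (1 + 2j) * z ** 4)


x = 0.2
y = sl(x)
z = sl((2 + 1j) * x)
print(abs(z - mult_2_plus_i(y)))           # discrepancy in the first formula
print(abs(sl(5 * x) - mult_2_minus_i(z)))  # discrepancy in the second formula
```

Both printed discrepancies should be at the level of floating-point rounding, which supports the reading of the two formulas used here.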

In his notes from this period,²⁸ Gauss defined the lemniscatic sine and cosine
functions by the equations

$$\operatorname{sin\,lemn}\left(\int_0^x \frac{dt}{\sqrt{1-t^4}}\right) = x, \qquad \operatorname{cos\,lemn}\left(\frac{1}{2}\omega - \int_0^x \frac{dt}{\sqrt{1-t^4}}\right) = x,$$

where $\omega = 2\int_0^1 (1-t^4)^{-1/2}\,dt$. Gauss sometimes abbreviated sin lemn and cos lemn as
s and c. We use the more common sl and cl. By an application of Euler’s addition
formula for elliptic integrals, Gauss found the addition formulas for the elliptic
functions sl and cl:

$$1 = ss + cc + sscc, \tag{33.104}$$

$$\operatorname{sl}(a+b) = \frac{sc' + s'c}{1 - ss'cc'}, \tag{33.105}$$

$$\operatorname{cl}(a+b) = \frac{cc' - ss'}{1 + ss'cc'}, \tag{33.106}$$

where s = sl(a), s′ = sl(b) and c, c′ are similarly defined. He employed these formulas
to express sl(nφ) and cl(nφ) in terms of sl(φ) and cl(φ). By a formal change of
variables t = iu, Gauss obtained

$$\int_0^{ix} (1-t^4)^{-1/2}\,dt = i\int_0^{x} (1-u^4)^{-1/2}\,du,$$

or, in terms of the lemniscatic functions:

$$\operatorname{sl}(iy) = i\operatorname{sl}(y), \qquad \operatorname{cl}(iy) = \frac{1}{\operatorname{cl}(y)}.$$

Thus, with (33.105), he had the formula for complex arguments

$$\operatorname{sl}(a+ib) = \frac{\operatorname{sl}(a) + i\operatorname{sl}(b)\operatorname{cl}(a)\operatorname{cl}(b)}{\operatorname{cl}(b) - i\operatorname{sl}(a)\operatorname{sl}(b)\operatorname{cl}(a)}. \tag{33.107}$$

Gauss used these formulas to determine that the periods of sl were 2ω and 2iω.
The ratio of the periods would then be $i = \sqrt{-1}$, a root of the quadratic equation
x² + 1 = 0, so that complex multiplication by $\sqrt{-1}$ was possible. Gauss also found
that the zeros and poles of sl(φ) were of the form (m + in)ω and ((m + 1/2) + i(n + 1/2))ω,
where m, n ∈ ℤ. These results allowed him to express the lemniscatic function as a
quotient of two entire functions

$$\operatorname{sl}(\varphi) = \frac{M(\varphi)}{N(\varphi)}, \tag{33.108}$$


28 Gauss (1863-1927) vol. 10₁, pp. 147-154.
where M and N were double infinite products. Gauss’s diary entry, for March 29,
1797, confirms that by that time he was aware of all these results. In the same entry,
he gave numerical evidence that log N(ω) agreed with π/2 to six decimal places and
noted that a proof would be an important advance in analysis. Gauss’s fragmentary
notes from this period also show that he had found the significant formula

$$M(\varphi)^4 + N(\varphi)^4 = N(2\varphi). \tag{33.109}$$


He also wrote, perhaps based on numerical evidence, that 


@ (20) 


Subsequent entries in Gauss’s diary suggest that he abandoned his intensive study
of elliptic functions for a year. The ninety-second entry, written July 1798, concerned
the lemniscatic function; he noted that he had “found out the most elegant things
exceeding all expectations and that by methods which open to us a whole new field
ahead.” According to his notes, his result was²⁹

$$\operatorname{sl}(\varphi) = \frac{P(\varphi)}{Q(\varphi)}, \tag{33.111}$$

$$P(\psi\omega) = \frac{\omega}{\pi}\,s\left(1 + \frac{4s^2}{(e^{\pi}-e^{-\pi})^2}\right)\left(1 + \frac{4s^2}{(e^{2\pi}-e^{-2\pi})^2}\right)\left(1 + \frac{4s^2}{(e^{3\pi}-e^{-3\pi})^2}\right)\cdots, \tag{33.112}$$

$$Q(\psi\omega) = \left(1 - \frac{4s^2}{(e^{\pi/2}+e^{-\pi/2})^2}\right)\left(1 - \frac{4s^2}{(e^{3\pi/2}+e^{-3\pi/2})^2}\right)\cdots. \tag{33.113}$$

From Abel, we may surmise that Gauss used the product formula for sin x to
transform the products M and N into new products expressed in terms of the variable
s = sin(ψπ).

Gauss also gave the equations connecting M and N with P and Q:

$$M(\psi\omega) = e^{\pi\psi^2/2}P(\psi\omega), \qquad N(\psi\omega) = e^{\pi\psi^2/2}Q(\psi\omega), \tag{33.114}$$

particular cases of Weierstrass’s relations connecting his sigma function with the theta
function. Note that when ψ = 1, s = sin(ψπ) = sin π = 0 and Q(ω) = 1. Therefore,
$N(\omega) = e^{\pi/2}$; thus, Gauss resolved the questions he raised the year before.


29 ibid. vol. 3, pp. 415-416. 
In the summer of 1798, Gauss discovered another important set of relations, the
significance of which is better understood by observing that for s = sin ψπ,

$$1 + \frac{4s^2}{(e^{n\pi}-e^{-n\pi})^2} = 1 + \frac{4s^2e^{-2n\pi}}{(1-e^{-2n\pi})^2} = \frac{1 - 2e^{-2n\pi}\cos 2\psi\pi + e^{-4n\pi}}{(1-e^{-2n\pi})^2} = \frac{(1-e^{-2n\pi}e^{2i\psi\pi})(1-e^{-2n\pi}e^{-2i\psi\pi})}{(1-e^{-2n\pi})^2},$$

an equation that converts the previous product for P (33.112) to

$$P(\psi\omega) = \frac{\omega}{\pi}\sin\psi\pi\prod_{n=1}^{\infty}\frac{(1-e^{-2n\pi}e^{2i\psi\pi})(1-e^{-2n\pi}e^{-2i\psi\pi})}{(1-e^{-2n\pi})^2}. \tag{33.115}$$

Similarly, the product for Q, as in (33.113), can be rewritten as

$$Q(\psi\omega) = \prod_{n=1}^{\infty}\frac{(1+e^{-(2n-1)\pi}e^{2i\psi\pi})(1+e^{-(2n-1)\pi}e^{-2i\psi\pi})}{(1+e^{-(2n-1)\pi})^2}. \tag{33.116}$$

From these results, Gauss found the Fourier series expansion of sl(φ):³⁰

$$\operatorname{sl}(\psi\omega) = \frac{4\pi}{\omega}\left(\frac{\sin\psi\pi}{e^{\pi/2}+e^{-\pi/2}} - \frac{\sin 3\psi\pi}{e^{3\pi/2}+e^{-3\pi/2}} + \cdots\right). \tag{33.117}$$

He also found Fourier series for log Q(ψω), log P(ψω), log sl(ψω), etc. For
example,

$$\log Q(\psi\omega) = -\frac{1}{2}\log 2 + \frac{\pi}{12} + \frac{2}{e^{\pi}-e^{-\pi}}\cos 2\psi\pi - \frac{1}{2}\cdot\frac{2}{e^{2\pi}-e^{-2\pi}}\cos 4\psi\pi + \cdots. \tag{33.118}$$


Note that products (33.115) and (33.116) are theta products revealing the form of 
the product representation for a general theta function. 

It appears that Gauss had reached this point in his researches on the lemniscatic 
function by the end of summer 1798. On September 28, he completed his studies
at Göttingen and departed for Braunschweig. From a letter to his great friend Wolfgang
Bolyai, we learn that Gauss was uncertain of his financial future. Note that Bolyai’s 
son was the noted Janos Bolyai, discoverer of non-Euclidean geometry. Gauss’s 
financial uncertainty remained until the end of the year when the Duke guaranteed 
him further support, suggesting that Gauss earn a doctoral degree in mathematics. 
Gauss accomplished this by submitting a thesis to the University of Helmstedt on 
the fundamental theorem of algebra, work he had completed a year earlier. He noted 


30 ibid. p. 417. 
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in his diary, “Proved by a valid method that equations have imaginary roots.” In a 
later addendum to this diary entry, from October 1798, Gauss wrote that this method 
was published August 1799 as his dissertation; he received his degree in July 1799 
on the recommendation of Johann Friedrich Pfaff (1765-1825). Gauss was then free 
to continue his mathematical researches with the Duke’s financial assistance. This 
increased after Gauss turned down an offer from St. Petersburg in 1802 and continued 
until he was appointed director of the observatory at the University of Göttingen where
he remained to the end of his life. 

In spite of his financial insecurity during the fall of 1798, Gauss’s creativity did 
not abate. In October 1798, Gauss noted, “New things in the field of analysis opened
up to us, namely, investigation of a function etc.” Gauss had earlier found the Fourier
expansion of P/Q, but he was excited now to discover the Fourier expansions of the
functions P and Q themselves:³¹

$$P(\psi\omega) = 2^{3/4}\sqrt{\frac{\pi}{\omega}}\left(e^{-\pi/4}\sin\psi\pi - e^{-9\pi/4}\sin 3\psi\pi + e^{-25\pi/4}\sin 5\psi\pi - \cdots\right) \tag{33.119}$$

and

$$Q(\psi\omega) = 2^{-1/4}\sqrt{\frac{\pi}{\omega}}\left(1 + 2e^{-\pi}\cos 2\psi\pi + 2e^{-4\pi}\cos 4\psi\pi + \cdots\right). \tag{33.120}$$

As consequences, he noted

$$1 - 2e^{-\pi} + 2e^{-4\pi} - 2e^{-9\pi} + \cdots = \sqrt{\frac{\omega}{\pi}}, \tag{33.121}$$

$$e^{-\pi/4} + e^{-9\pi/4} + e^{-25\pi/4} + \cdots = \frac{1}{2}\sqrt{\frac{\omega}{\pi}}, \tag{33.122}$$

$$\sqrt{\frac{\omega}{\pi}} = 0.91357913815611682140724259.$$

He found this value by computing $2e^{-\pi/4}$ to thirty-nine decimal places. Note that
(33.121) in fact gives the period of the lemniscatic function as a theta series value.
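The numerical value quoted above is easy to reproduce. The short Python sketch below is our own illustration, not Gauss's computation: it reads (33.121) as the alternating series 1 − 2e^{−π} + 2e^{−4π} − ⋯ and (33.122) as e^{−π/4} + e^{−9π/4} + ⋯, and it assumes the relation π/ω = agM(1, √2) in order to obtain √(ω/π) independently. The function name `agm` is hypothetical.

```python
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive reals (fixed iteration count)."""
    for _ in range(40):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


q = math.exp(-math.pi)

# Reading of (33.121): alternating theta series at q = e^{-pi}.
series_121 = 1 + 2 * sum((-1) ** n * q ** (n * n) for n in range(1, 12))

# Reading of (33.122): sum over odd squares, expected to equal half of the above.
series_122 = sum(math.exp(-((2 * n + 1) ** 2) * math.pi / 4) for n in range(12))

# sqrt(omega/pi), assuming pi/omega = agM(1, sqrt(2)).
root_from_agm = 1 / math.sqrt(agm(1.0, math.sqrt(2.0)))

print(series_121, 2 * series_122, root_from_agm)
```

All three printed numbers should agree with the leading digits 0.913579138156… of the value quoted in the text.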

By comparing the products for P and Q in (33.115) and (33.116) with the series for 
P and Q in (33.119) and (33.120), we see that by 1798 Gauss knew the triple product 
identity. In fact, to derive the series from the product, one requires not only the triple 
product identity but also an additional formula. Gauss could have derived this from 
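what he already knew. First consider how the factor $e^{-\pi/4}$ arises in the series for P.
The addition formula (33.107) implies Gauss’s observation that

$$\operatorname{sl}\left(\psi\omega + \frac{\omega}{2} + \frac{i\omega}{2}\right) = \frac{-i}{\operatorname{sl}(\psi\omega)}. \tag{33.123}$$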


31 ibid. vol. 10₁, pp. 536-537.
Here we mention that Jacobi used a similar formula in his Fundamenta Nova. We
also note that

$$e^{i\pi\left(\psi + \frac{1}{2} + \frac{i}{2}\right)} = ie^{-\pi/2}e^{i\psi\pi}.$$

When these two relations are applied to the formulas (33.111), (33.115), and (33.116),
the result after a simple calculation for $q = e^{-\pi}$ is that

$$\frac{\omega}{4\pi}\,q^{-1/2}\prod_{n=1}^{\infty}\left(\frac{1+q^{2n-1}}{1-q^{2n}}\right)^2 = \frac{\pi}{\omega}\prod_{n=1}^{\infty}\left(\frac{1-q^{2n}}{1+q^{2n-1}}\right)^2.$$

This simplifies to

$$\frac{\omega}{\pi} = 2e^{-\pi/4}\prod_{n=1}^{\infty}\left(\frac{1-q^{2n}}{1+q^{2n-1}}\right)^2.$$

When the value of ω/π from this equation is substituted in (33.115), we arrive at

$$P(\psi\omega) = 2e^{-\pi/4}\sin\psi\pi\prod_{n=1}^{\infty}\frac{(1-e^{-2n\pi}e^{2i\psi\pi})(1-e^{-2n\pi}e^{-2i\psi\pi})}{(1+e^{-(2n-1)\pi})^2}.$$

Here observe that the factor $e^{-\pi/4}$ is accounted for. After applying the triple product
identity to this equation, we obtain the series

$$P(\psi\omega) = 2e^{-\pi/4}\,\frac{\sum_{n=0}^{\infty}(-1)^n e^{-n(n+1)\pi}\sin(2n+1)\psi\pi}{\prod_{n=1}^{\infty}(1-e^{-2n\pi})(1+e^{-(2n-1)\pi})^2}. \tag{33.124}$$

Since $n(n+1) + \frac{1}{4} = \frac{1}{4}(2n+1)^2$, we see that this is Gauss’s series except for the
infinite product in the denominator. To eliminate this term, we apply Gauss’s relations
(33.109), (33.110), and (33.114) to get

$$P\left(\frac{\omega}{2}\right) = Q\left(\frac{\omega}{2}\right) = 2^{-1/4}.$$

Setting ψ = 1/2 and using the last relation in (33.115) and (33.116), we find

$$2^{-1/4} = 2e^{-\pi/4}\prod_{n=1}^{\infty}\frac{(1+e^{-2n\pi})^2}{(1+e^{-(2n-1)\pi})^2},$$

$$2^{-1/4} = \prod_{n=1}^{\infty}\frac{(1-e^{-(2n-1)\pi})^2}{(1+e^{-(2n-1)\pi})^2}.$$

From these two equations, a few lines of calculation yield

$$2^{-1/4}\sqrt{\frac{\pi}{\omega}} = \prod_{n=1}^{\infty}\frac{1}{(1-e^{-2n\pi})(1+e^{-(2n-1)\pi})^2}.$$

Ultimately, we obtain Gauss’s series (33.119) when we apply this equation to
(33.124). Series (33.120) can be obtained similarly.
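The passage from the product (33.115) to the series (33.119) can be tested numerically before one works through the algebra. The sketch below is an illustration under stated assumptions, not part of Gauss's notes: it uses the real form 1 − 2q^{2n} cos 2ψπ + q^{4n} of each pair of factors in (33.115), reads (33.119) as 2^{3/4}√(π/ω)(e^{−π/4} sin ψπ − e^{−9π/4} sin 3ψπ + ⋯), and takes ω from π/ω = agM(1, √2). All function names are hypothetical.

```python
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive reals (fixed iteration count)."""
    for _ in range(40):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


OMEGA = math.pi / agm(1.0, math.sqrt(2.0))  # assumption: pi/omega = agM(1, sqrt 2)
q = math.exp(-math.pi)


def P_product(psi, terms=20):
    """Product (33.115), with each conjugate pair of factors multiplied out."""
    t = psi * math.pi
    prod = 1.0
    for n in range(1, terms + 1):
        qn = q ** (2 * n)
        prod *= (1 - 2 * qn * math.cos(2 * t) + qn * qn) / (1 - qn) ** 2
    return OMEGA / math.pi * math.sin(t) * prod


def P_series(psi, terms=10):
    """Series (33.119) as read above."""
    t = psi * math.pi
    s = sum((-1) ** n * math.exp(-((2 * n + 1) ** 2) * math.pi / 4)
            * math.sin((2 * n + 1) * t) for n in range(terms))
    return 2 ** 0.75 * math.sqrt(math.pi / OMEGA) * s


def omega_over_pi_product(terms=20):
    """The intermediate product formula 2 e^{-pi/4} prod ((1-q^{2n})/(1+q^{2n-1}))^2."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= ((1 - q ** (2 * n)) / (1 + q ** (2 * n - 1))) ** 2
    return 2 * math.exp(-math.pi / 4) * prod


print(P_product(0.3), P_series(0.3))
print(omega_over_pi_product(), OMEGA / math.pi)
```

Each printed pair should agree to rounding error; the value at ψ = 1/2 should also match 2^{−1/4}, the quantity used in the argument above.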

Although the triple product identity is difficult to prove ab initio, Gauss gave at
least two documented proofs of this kind, thought to date from approximately 1808.
However, it is possible that Gauss could have proved the triple product identity in 1798
by assuming Euler’s pentagonal number theorem and in that case the proof would
have been straightforward, the necessary technique having been established by Euler.
Consider the product in the numerator of Q(ψω) in (33.116). For convenience, set
$q = e^{-\pi}$ and $x = e^{2i\psi\pi}$, so that the product becomes

$$f(x) = \prod_{n=1}^{\infty}(1+q^{2n-1}x)\left(1+\frac{q^{2n-1}}{x}\right).$$

Then

$$f(q^2x) = \prod_{n=1}^{\infty}(1+q^{2n+1}x)\left(1+\frac{q^{2n-3}}{x}\right) = \frac{1+\frac{1}{qx}}{1+qx}\,f(x) = \frac{1}{qx}\,f(x).$$

Now let $f(x) = \sum_{n=-\infty}^{\infty} a_nx^n$ so the previous equation becomes

$$\sum_{n=-\infty}^{\infty} a_nx^n = qx\sum_{n=-\infty}^{\infty} a_nq^{2n}x^n = \sum_{n=-\infty}^{\infty} a_nq^{2n+1}x^{n+1}.$$

By equating the coefficients of $x^n$, we see that

$$a_n(q) = a_{n-1}(q)\,q^{2n-1} = a_{n-2}(q)\,q^{2n-1}q^{2n-3} = \cdots = a_0(q)\,q^{n^2}.$$

Hence

$$\prod_{n=1}^{\infty}(1+q^{2n-1}x)\left(1+\frac{q^{2n-1}}{x}\right) = a_0(q)\sum_{n=-\infty}^{\infty}q^{n^2}x^n. \tag{33.125}$$

These simple calculations appear in Gauss’s notes of 1799. Now to employ Euler’s
pentagonal number theorem

$$\prod_{n=1}^{\infty}(1-p^n) = \sum_{n=-\infty}^{\infty}(-1)^np^{\frac{n(3n+1)}{2}}, \tag{33.126}$$

we set $q = p^{3/2}$ and $x = -p^{1/2}$ in (33.125) to get

$$\prod_{n=1}^{\infty}(1-p^{3n-1})(1-p^{3n-2}) = a_0(q)\sum_{n=-\infty}^{\infty}(-1)^np^{\frac{n(3n+1)}{2}}. \tag{33.127}$$

Comparing (33.126) and (33.127), we arrive at

$$a_0(q) = \frac{1}{\prod_{n=1}^{\infty}(1-p^{3n})} = \frac{1}{\prod_{n=1}^{\infty}(1-q^{2n})},$$

and this proves the triple product identity.
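The argument just given can be illustrated numerically. The following Python sketch is ours and makes no claim about how Gauss computed: it compares truncations of the two sides of the triple product identity in the form ∏(1 − q^{2n})(1 + q^{2n−1}x)(1 + q^{2n−1}/x) = Σ q^{n²}xⁿ, and then checks that the substitution q = p^{3/2}, x = −p^{1/2} turns the product into Euler's product ∏(1 − pⁿ). The function names and the sample values of q, x, and p are arbitrary choices.

```python
def product_side(q, x, terms=60):
    """Truncation of prod (1 - q^{2n})(1 + q^{2n-1} x)(1 + q^{2n-1}/x)."""
    prod = 1.0 + 0j
    for n in range(1, terms + 1):
        prod *= (1 - q ** (2 * n)) * (1 + q ** (2 * n - 1) * x) * (1 + q ** (2 * n - 1) / x)
    return prod


def series_side(q, x, terms=30):
    """Truncation of the bilateral series sum q^{n^2} x^n."""
    return sum(q ** (n * n) * x ** n for n in range(-terms, terms + 1))


def euler_product(p, terms=200):
    """Truncation of prod (1 - p^n)."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= 1 - p ** n
    return prod


def pentagonal_series(p, terms=30):
    """Truncation of sum (-1)^n p^{n(3n+1)/2}; n(3n+1) is always even."""
    return sum((-1) ** n * p ** (n * (3 * n + 1) // 2) for n in range(-terms, terms + 1))


q, x = 0.3, 0.8 + 0.5j
print(abs(product_side(q, x) - series_side(q, x)))   # triple product identity

p = 0.4
print(abs(euler_product(p) - pentagonal_series(p)))  # pentagonal number theorem
print(abs(product_side(p ** 1.5, -(p ** 0.5)) - euler_product(p)))  # the substitution
```

The three printed differences should all be at rounding level. The last line is the computational counterpart of comparing (33.126) with (33.127).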

After his remarkable work of 1798, Gauss’s journal entry of May 30, 1799,
connected the agM with the lemniscatic integral: “We have proved that the
arithmetic-geometric mean of 1 and √2 is π/ω to 11 places, which thing being proved a new field
will certainly be opened up.”

We derive the agM of √2 and 1 from some previously mentioned formulas of
Gauss. By taking ψ = 1 in (33.120), we get

$$1 + 2e^{-\pi} + 2e^{-4\pi} + 2e^{-9\pi} + \cdots = 2^{1/4}\sqrt{\frac{\omega}{\pi}}.$$

In terms of the functions A and B in (33.97) and (33.99), the previous formula and
(33.121) imply that

$$A^2(e^{-\pi}) = 2^{1/2}\,\frac{\omega}{\pi} \quad\text{and}\quad B^2(e^{-\pi}) = \frac{\omega}{\pi}.$$

When this is combined with the fact that $A^2(x^2)$ and $B^2(x^2)$ are the arithmetic
and geometric means, respectively, of $A^2(x)$ and $B^2(x)$, it follows that the agM of
√2 and 1 is π/ω. This means that, if in 1795 Gauss knew the connection of the two
series $\sum x^{n^2}$ and $\sum(-1)^nx^{n^2}$ with the agM, then in 1799 he had a proof of the result
quoted from the diary. Since he enjoyed numerical computation, he also verified this
result to eleven places. Felix Klein and Ludwig Schlesinger, the editors of Gauss’s
mathematical diary, have remarked that the May 30 entry could represent a conclusion
or a conjecture. It is very likely that it was a conclusion and that when he spoke of a
new field, Gauss had in mind a generalization to any two real numbers a and b instead
of the pair 1, √2. As we have seen, Lagrange had already found this generalization in
1785. Gauss published his work in an astronomical paper of 1818,³² where he wrote
that he discovered the result before he saw the paper of Lagrange.
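Gauss's eleven-place verification is easy to repeat with modern tools. The Python sketch below is a minimal illustration, assuming ω = 2∫₀¹ dt/√(1 − t⁴) and taking A and B to be the series Σxⁿ² and Σ(−1)ⁿxⁿ² over all integers n; the substitution t = sin θ, Simpson's rule, and the function names are our own choices rather than anything in Gauss's notes.

```python
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive reals (fixed iteration count)."""
    for _ in range(40):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def omega_by_simpson(steps=2000):
    """omega = 2*int_0^1 dt/sqrt(1-t^4) = 2*int_0^{pi/2} dtheta/sqrt(1+sin^2 theta)."""
    f = lambda th: 1.0 / math.sqrt(1.0 + math.sin(th) ** 2)
    h = (math.pi / 2) / steps
    total = f(0.0) + f(math.pi / 2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return 2 * total * h / 3


def A(x, terms=12):
    """Sum of x^{n^2} over all integers n."""
    return 1 + 2 * sum(x ** (n * n) for n in range(1, terms))


def B(x, terms=12):
    """Sum of (-1)^n x^{n^2} over all integers n."""
    return 1 + 2 * sum((-1) ** n * x ** (n * n) for n in range(1, terms))


m = agm(1.0, math.sqrt(2.0))
print(m, math.pi / omega_by_simpson())  # both should be 1.19814023473...

# One step of the agM applied to A^2(x), B^2(x) gives A^2(x^2), B^2(x^2).
x = math.exp(-math.pi)
print(A(x * x) ** 2 - (A(x) ** 2 + B(x) ** 2) / 2)
print(B(x * x) ** 2 - math.sqrt(A(x) ** 2 * B(x) ** 2))
```

The last two printed differences should vanish to rounding error, which is the duplication property used in the text; iterating it drives both sequences to 1, so the agM of A²(e^{−π}) and B²(e^{−π}) is 1.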

It appears that up to 1798, Gauss did not investigate elliptic functions beyond the
lemniscatic function, but with his discovery of the connection between the agM and
the elliptic integral $\int \frac{d\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}}$, he began to explore the inversion of more


32 ibid. vol. 3, pp. 331-355. 




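general elliptic integrals. This culminated with his journal entry of May 6, 1800: “We
have led the theory of transcendental quantities:

$$\int \frac{dx}{\sqrt{(1-\alpha xx)(1-\beta xx)}}$$

to the summit of universality.” Gauss’s notes show that he used the agM to define
two theta functions whose ratio he demonstrated to be the elliptic function inverting
the integral. Gauss’s approach to elliptic functions as ratios of theta functions was the
same as the point of view taken by Jacobi in his 1836 Königsberg lectures. Gauss
started with the integral³³

$$\int \frac{du}{\sqrt{(1+\mu\mu\sin^2u)}} = \varphi$$

and set

$$\mu = \tan v, \qquad \frac{\pi}{M\sqrt{(1+\mu\mu)}} = \frac{\pi\cos v}{M\cos v} = \omega, \qquad \frac{\pi}{\mu M\sqrt{(1+\frac{1}{\mu\mu})}} = \frac{\pi\cos v}{M\sin v} = \omega'.$$

Note that Gauss denoted the agM of $\sqrt{(1+\mu^2)}$ and 1 by

$$M(\sqrt{(1+\mu^2)}, 1) = M\sqrt{(1+\mu\mu)},$$

so that

$$M\cos v = M(1,\cos v) \quad\text{and}\quad M\sin v = M(1,\sin v).$$

Note also that the Lagrange-Gauss agM theorem implied that

$$\frac{\omega}{2} = \int_0^{\pi/2}\frac{du}{\sqrt{(1+\mu^2\sin^2u)}} = \int_0^1\frac{dx}{\sqrt{((1-x^2)(1+\mu^2x^2))}},$$

$$\frac{\omega'}{2} = \int_0^{\pi/2}\frac{du}{\sqrt{(\mu^2+\sin^2u)}} = \int_0^1\frac{dx}{\sqrt{((1-x^2)(\mu^2+x^2))}}.$$

Gauss then wrote the elliptic function as

$$S(\psi\omega) = \frac{\pi}{\mu\omega}\left(\frac{4\sin\psi\pi}{e^{\omega'\pi/(2\omega)}+e^{-\omega'\pi/(2\omega)}} - \frac{4\sin 3\psi\pi}{e^{3\omega'\pi/(2\omega)}+e^{-3\omega'\pi/(2\omega)}} + \cdots\right) = \frac{T(\psi\omega)}{W(\psi\omega)},$$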

33 ibid. vol. 10₁, pp. 194-198.
where the theta functions T and W were defined by the series

$$W(\psi\omega) = \sqrt{M\cos v}\left(1 + 2e^{-\omega'\pi/\omega}\cos 2\psi\pi + 2e^{-4\omega'\pi/\omega}\cos 4\psi\pi + \cdots\right),$$

$$T(\psi\omega) = \sqrt{\cot v}\,\sqrt{M\cos v}\left(2e^{-\omega'\pi/(4\omega)}\sin\psi\pi - 2e^{-9\omega'\pi/(4\omega)}\sin 3\psi\pi + \cdots\right).$$

To demonstrate that his elliptic function was actually the inversion of his original
elliptic integral, he effectively showed that if

$$\int_0^u \frac{dt}{\sqrt{(1+\mu\mu\sin^2t)}} = \varphi,$$

then S(φ) = sin u. Without giving details, he next wrote down the zeros of W and T
and extended to this elliptic function all the results he had obtained for the lemniscatic
function.
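The inversion property S(φ) = sin u can be tested numerically for several values of μ. The sketch below rests on our reading of the formulas of this section, namely ω = π/M(1, √(1 + μ²)), ω′ = π/(μ M(1, √(1 + 1/μ²))), and the Fourier series for S with prefactor π/(μω); these readings, the use of Simpson's rule, and all function names are assumptions made for illustration.

```python
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive reals (fixed iteration count)."""
    for _ in range(40):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def S_fourier(phi, mu, terms=40):
    """Fourier series for the function inverting int du/sqrt(1 + mu^2 sin^2 u)."""
    w = math.pi / agm(1.0, math.sqrt(1 + mu * mu))                 # omega
    w1 = math.pi / (mu * agm(1.0, math.sqrt(1 + 1 / (mu * mu))))   # omega'
    t = math.pi * phi / w                                          # psi * pi
    total = 0.0
    for n in range(terms):
        k = 2 * n + 1
        # 4/(e^{a} + e^{-a}) = 2/cosh(a) with a = k * omega' * pi / (2 omega)
        total += (-1) ** n * math.sin(k * t) / math.cosh(k * w1 * math.pi / (2 * w))
    return 2 * math.pi / (mu * w) * total


def elliptic_integral(u, mu, steps=2000):
    """phi = int_0^u dt / sqrt(1 + mu^2 sin^2 t), by Simpson's rule."""
    f = lambda t: 1.0 / math.sqrt(1.0 + (mu * math.sin(t)) ** 2)
    h = u / steps
    total = f(0.0) + f(u)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


u = 0.7
for mu in (0.5, 1.0, 2.0):
    phi = elliptic_integral(u, mu)
    print(mu, S_fourier(phi, mu), math.sin(u))
```

For each μ the second and third printed numbers should agree to rounding error; the case μ = 1 reproduces the lemniscatic series (33.117).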


33.7 Exercises 


(1) Show that if r = © = then 
+u 


| a vit® _ fa, vit@ _avi-# , avi-# 
JI—-? Sa. eee ae 
See Fagnano (1911) vol. 2, p. 453. 
(2) Show that the complete integral of Ta = Ta racy is given by 
2222 
forty) + 2 — gery ty) —2fxy — 27a +y) —2fe =0. 
Ay: 


See Eu. I-20 p. 78. 


(3) Let (x,y) = []p2,. + x7"! y)(1 + 
that 


xen! 


) and [x] = [](2, (1 — x”). Show 


[ee 
2! 


(x,y): (« *) = (7,07) - (7, y?) + xary(x?, 07x)? x7") 
See Gauss (1863-1927) vol. 3, p. 458. 
(4) Let 
2n—-1 


[asertp(i+4 Jor de O™ +7), 


y m=—CO 


n=1 


and let [x] be defined as in the problem above. Show that 
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(a) 
Pres ars Sear a ing od beeen © Sl I 
1—2x +2x4—2x9 +... [x2]? 1—2x +2x4—2x9 4...’ 
(b) 
eg OO ah 1 
1 —2x44+ 2x16 —27364... [x2][x8] 1—2x442x16-... 
(c) 
[x7] Fx = [x8] Fxt = [x1 F x! = [x 8] Fx™ = etc. = 1, 
(d) 
elie Ba tS 
(e) 
(oy yt [x2] = a ae oak oe Gage 
[xPIx4Ph l—-x 14x? 1-x3 


See Gauss (1863-1927) vol. 3, pp. 446-447. Observe that this was one of 
Gauss’s proofs of the triple product identity. 


(5) Set

$$Px = 1 + 2\sum_{n=1}^{\infty}x^{n^2}, \qquad Qx = 1 + 2\sum_{n=1}^{\infty}(-1)^nx^{n^2}, \qquad Rx = 2\sum_{n=1}^{\infty}x^{\frac{(2n-1)^2}{4}}.$$

Note that we would write Px = P(x), etc. Show that


aes Veal 
(a) Rx = 2x4 ey” 


(b) Px. Ox =(Oxx)?) Px Rx = BY 

(c) Px + Qx = 2P(x⁴); Px − Qx = 2R(x⁴); (Px)² − (Qx)² = 2(Rxx)²,
(d) Px + iQx = (1 + i)Q(ix); Px − iQx = (1 − i)P(ix),

(e) (Px)² + (Qx)² = 2(Pxx)²; (Px)⁴ − (Qx)⁴ = (Rx)⁴,

(f) (Px)², (Qx)² have an arithmetic geometric mean that is always 1,


(g)

$$\int_0^{2\pi}\frac{d\theta}{\sqrt{((Px)^4\cos^2\theta + (Qx)^4\sin^2\theta)}} = 2\pi.$$

We note that Gauss wrote cos θ² for cos² θ, etc. See Gauss (1863-1927) vol. 3,
pp. 465-467.
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(6) Show that 


8x l6xx  24x° 32x 
f=x% “1 + XX ' 1x3 
8x 8xx 8x3 8x4 


hd= 2 (eee Ge? dae? 


=1 


See Gauss (1863-1927) vol. 3, p. 445. 
(7) Show that 


where 


O_ | de 
2 Jo Jz) 
See Gauss (1863-1927) vol. 3, p. 425. 
(8) This exercise gives a proof of the transformation formula for a theta function, 


first published by Cauchy and Poisson. See Section 34.11. Gauss worked out 
the details given here in an unpublished paper. 


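(a) Expand $T = \sum_{k=-\infty}^{\infty} e^{-\alpha(k+\omega)^2}$ as a Fourier series

$$T = A_0 + 2\sum_{n=1}^{\infty}A_n\cos n\omega P, \quad\text{where } A_n = \int_0^1 T\cos n\omega P\,d\omega \text{ and } P = 2\pi.$$

(b) Show that

$$A_n = \int_{-\infty}^{\infty}e^{-\alpha\omega\omega}\cos n\omega P\,d\omega = e^{-\frac{nn\pi\pi}{\alpha}}\sqrt{\frac{\pi}{\alpha}}.$$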


(c) Conclude that 


2 TT i\2 
um aol 
, eo akk+o) | £ tO® , » eo a ATS) ; 
a 


k=—0co k=—00 


See Gauss (1863-1927) vol. 3, pp. 436-437. 


33.8 Notes on the Literature 


Fagnano (1911) is a reprint of his Produzioni Matematiche. See pp. 293-297, 304-313 
of vol. 2 for material on the lemniscate. This volume contains several more articles by 
Fagnano on the integral calculus and on the lemniscatic calculus. 

Volumes 20-21 of series 1 of Euler’s Opera Omnia contain the papers of Euler 
providing the foundation for the theory of algebraic functions and their integrals. 
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Of course, Euler dealt only with the integrals arising from the algebraic equation 
y? = p(x), where p(x) was of degree 4. 

See Landen (1771) and (1775) for his original contributions to elliptic integrals. Our 
exposition is based largely on Cayley (1895), pp. 327-330. Cayley’s book contains a 
good account of Jacobi’s Fundamenta Nova, elaborating the transformation theory of 
elliptic integrals of Landen, Legendre, and Gauss. 

Gauss’s extensive work on the agM and his work on elliptic functions in general 
can be found in Gauss (1863-1927), vols. 3 and 10. Pieper (1998) suggests that 
Gauss discovered the triple product identity between April and June 1800. He has 
also pointed out that this identity can be proved easily by applying Euler’s pentagonal 
number theorem. Berggren, Borwein, and Borwein (1997) contains a number of
interesting papers on the agM and its application to the computation of π.

There are several interesting historical accounts of the theory of elliptic functions 
and integrals. Weil (1983) deals with Euler’s work on this topic and its relation to 
Diophantine equations. Varadarajan (2006) gives a brief analysis of Euler in terms of 
Riemann surfaces of genus one. Watson (1933) gives a very entertaining and detailed 
mathematical exposition of Fagnano, Landen, and Ivory’s contributions to elliptic 
integrals. Watson also wrote, without giving a reference, that Jacobi called December 
23, 1751, the birthday of elliptic functions. Later, André Weil observed, “According to 
Jacobi, the theory of elliptic functions was born between the twenty-third of December 
1751, and the twenty-seventh of January 1752.” See Weil (1983) p. 1. 

Ozhigova (2007), first published in 1988 (also reprinted in Bogolyubov, Mikhailov,
and Yushkevich (2007)), refers to Jacobi’s 1847 letter to Fuss, saying on p. 55 that
Euler’s study of Fagnano inaugurated the subject of elliptic functions. We observe
that in his letter of October 24, 1847, to Euler’s great-grandson P. H. Fuss, Jacobi
strongly recommended the publication of Euler’s papers, arguing that they were very
important to the advancement of science. As further support for his point, Jacobi
mentioned that by reading the minutes of the Berlin Academy, he discovered a
critical date in the history of mathematics: when the Academy assigned Euler the
task of refereeing Count Fagnano’s mathematical work. Jacobi then stated that Euler’s
evaluation of these papers served to found the theory of elliptic functions. See Stäckel
and Ahrens (1908) p. 23.

Cox (1984) contains a fascinating and enlightening resumé of Gauss’s remarkable 
work on the agM of two complex numbers. He shows that Gauss may have had sig- 
nificant ideas on the modular group and some of its subgroups and their fundamental 
domains. The reader may wish to read this paper before reading Gauss’s somewhat 
fragmentary original papers on the topic. Mittag-Leffler (1923) and Almkvist and 
Berndt (1988) are both interesting papers. The first is an insightful account of work 
on elliptic functions and integrals from 1718 to 1870; the second focuses on topics 
related to the quadratic transformation and the agM. Mittag-Leffler (1923) is in fact 
an English translation of an 1876 paper published in Swedish. 

The first chapter of Siegel (1969) vol. 1 contains perceptive remarks concerning 
Fagnano and Euler on the addition formula. Bühler (1981) and Dunnington (2004) are
well-written biographies of Gauss. Bühler has more mathematical exposition, but the
value of Dunnington is enhanced by the inclusion of an English translation with com- 
mentary by J. J. Gray of Gauss’s diary; we have made use of this translation in the text. 


34 


Elliptic Functions: Nineteenth Century 


34.1 Preliminary Remarks 


The eighteenth century saw two major new results in elliptic functions: the addition 
formula of Euler and the second-order transformation of Landen and Lagrange. Gauss 
discovered yet another quadratic transformation, in connection with his proof that 
the agM of two positive numbers could be represented by an elliptic integral. These 
transformations changed the parameters in the elliptic integrals, without altering
their basic form. In fact, Gauss went well beyond this elementary transformation 
theory, and before the end of the eighteenth century he had greatly refined elliptic 
function theory. He did not publish his work; it was rediscovered by Abel and Jacobi 
in the 1820s. 

Adrien-Marie Legendre was the main contributor to elliptic integrals in the period
between Lagrange and Abel. In his first major work, Exercices de calcul intégral of
1811-1817, he reduced any elliptic integral $\int A(x)\,\frac{dx}{\sqrt{R(x)}}$, where R(x) was a
fourth-degree polynomial in x and A(x) was a rational function in x and R(x), to integrals
of three kinds:¹

$$F(k,x) = \int_0^x \frac{dt}{\sqrt{(1-t^2)(1-k^2t^2)}}, \tag{34.1}$$

$$E(k,x) = \int_0^x \frac{\sqrt{1-k^2t^2}}{\sqrt{1-t^2}}\,dt, \tag{34.2}$$

$$\Pi(n,k,x) = \int_0^x \frac{dt}{(1+nt^2)\sqrt{(1-t^2)(1-k^2t^2)}}. \tag{34.3}$$

Legendre’s second major work was the three-volume Traité des fonctions elliptiques
of 1825-1828;² the first volume presented the received eighteenth-century theory of
elliptic integrals with some improvements and additions; the second volume gave 


1 Legendre (1811-1817) vol. 1, p. 19. 
2 Legendre (1825-1828). 
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extensive and long-useful numerical tables, constructed by Legendre’s own methods. 
At the age of seventy-five, upon learning of the more advanced results of Abel and 
Jacobi, Legendre did his best to give a flattering exposition of their work, and this was 
the topic of the third volume. 

Legendre (1752-1833) studied at the College Mazarin in Paris, where he received 
an excellent education. Apparently, Legendre wished to be remembered for his works 
alone and not much is known of his personal life. In fact, it was only recently 
discovered that the portrait by which he had been identified for a century was 
actually that of an unrelated politician named Louis Legendre. Thus, the only portrait 
now available is a sketched caricature made by Julien-Léopold Boilly. Legendre’s 
research on his two favorite subjects, number theory and elliptic functions, was 
immediately superseded after his books appeared. Nevertheless, Legendre’s name 
became permanently associated with several mathematical objects, including the 
Legendre polynomials, the Legendre symbol, and the Legendre differential equation. 
Though he studied elliptic functions for almost forty years, Legendre apparently never 
considered inverting the integral. Abel was the first to publish this idea, inaugurating 
a great advance in this topic. 

The mathematical career of Niels Henrik Abel began in 1821 with his attempt to 
solve the general quintic equation. His mathematics professors at the University of 
Christiania could find no errors in Abel’s solution and communicated it to Ferdinand 
Degen in Copenhagen. Though Degen could not find the mistake, he made two 
suggestions: that Abel apply his method to specific examples, since that could reveal 
hidden errors; and that he abandon the sterile subject of algebraic equations to exercise 
his brilliance in the more fruitful subject of elliptic integrals. Degen’s advice led Abel 
to find the mistake in his work and eventually to prove the impossibility of solving the 
quintic in radicals. He also began to work on elliptic integrals, and it is fairly certain 
that by 1823 he had inverted the elliptic integral to rediscover elliptic functions.³ We
recall that Gauss had already done this without publishing it. Moreover, the problem 
of the division of elliptic functions carried Abel deeper into the theory of algebraic 
equations and ultimately to his famous theorem on solvable equations. In this manner, 
Abel found an extremely productive connection between elliptic functions and the 
theory of algebraic equations. 

Abel’s first work on elliptic functions, the first part of “Recherches sur les fonctions
elliptiques,” appeared in Crelle’s Journal in September 1827.⁴ In this paper he defined
the elliptic function φα = x when

$$\alpha = \int_0^x \frac{dt}{\sqrt{(1-c^2t^2)(1+e^2t^2)}}. \tag{34.4}$$

He showed that φα was a meromorphic function with two independent periods, 2ω
and 2iϖ, given by


3 Abel (1965) vol. 2, p. 254. 
4 For an English translation of this paper, see Abel (2007) pp. 145-245. 
$$\omega = 2\int_0^{1/c}\frac{dx}{\sqrt{(1-c^2x^2)(1+e^2x^2)}}, \qquad \varpi = 2\int_0^{1/e}\frac{dx}{\sqrt{(1-e^2x^2)(1+c^2x^2)}}. \tag{34.5}$$

He then gave a new proof for the addition formula for elliptic functions and used it
to express φ(nα), where n was an integer, as a rational function of φα, fα = √(1 −
c²φ²α), and Fα = √(1 + e²φ²α). This was analogous to expressing sin(nx) as a
polynomial in sin x and cos x = √(1 − sin² x). With n an odd integer, he noted that
the rational function could be written as $\frac{x\,p(x)}{q(x)}$, where p and q were polynomials in
x = φα and of degree n² − 1. Abel next showed that the solution of the equation
p(x) = 0, whose roots were

$$\varphi\left(\frac{a\omega + ib\varpi}{n}\right)$$

for integers a and b, depended on an equation of degree n + 1 which could be solved
algebraically only in particular cases.

In the second part of his “Recherches,” Abel applied his theory to the division of
the lemniscate.⁵ Recall that Fagnano divided the full arc of the lemniscate in the first
quadrant into two, three, and five equal parts. In his Disquisitiones, without reference
to Fagnano, Gauss stated that the theory he had constructed for the division of a circle
into n equal parts could be extended to the lemniscate, but he never gave details on
this.⁶ In the case of the circle, Gauss was able to simplify the problem, so that he had
only to prove that cos(2π/n) could be expressed in terms of square roots, when n was a
prime of the form $2^{\ell} + 1$. He did this by showing that cos(2π/n) satisfied an appropriate
algebraic equation of degree $\frac{n-1}{2} = 2^{\ell-1}$.

Thus, in order to extend Gauss’s theory to the lemniscate, Abel had to find the
division point by working with sl(2ω/n), where 2ω was the period of the lemniscatic
function. However, note that sl(2ω/n) satisfies an equation of degree n² − 1, and this
cannot be a power of 2, except when n = 3. This drawback would apparently suggest
that Gauss’s theory for the circle could not be extended to the case of the lemniscate.
But Abel found a resolution to this roadblock by discovering complex multiplication
of elliptic functions.⁷

The primes expressible as $2^{\ell} + 1$, except for 3, take the form 4m + 1 and can be
written as sums of two squares, 4m + 1 = a² + b², where a + b is odd. Abel used
this fact to solve the problem of dividing the lemniscate into n = 4m + 1 parts. He
showed that the complex number sl(2ω/(a + ib)) was the solution to an equation of degree
n − 1 = 4m with coefficients of the form c + id, where c and d were rational numbers.
To prove this, he used the addition formula for sl α to first prove that sl((a + ib)α)
could be expressed as a rational function $\frac{x\,p(x)}{q(x)}$, x = sl α, where p(x) and q(x) were
polynomials of degree n − 1. Next, he employed the Lagrange resolvent, just as Gauss
had done for the cyclotomic case, to show that sl(2ω/(a + ib)) could be evaluated by means of


5 Abel (2007) pp. 245-283. 
6 Gauss (1965) p. 407.
7 Abel (2007) pp. 245-255. 




only square roots, providing n was of the form $2^{\ell} + 1$. Abel pointed out that the value of
sl(2ω/n) could then be found by means of square roots and so this value was constructible
by straight edge and compass. The second part of the “Recherches” also dealt with 
the transformation of elliptic functions, but on this topic Jacobi had published earlier 
than Abel. 
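The arithmetic facts used in Abel's argument are simple to confirm by machine. The following Python sketch is only an illustration with hypothetical helper names: it finds the representation n = a² + b² for the primes 5, 17, 257, and 65537 of the form 2^ℓ + 1, checks that a + b is odd and that n = (a + ib)(a − ib), and confirms that n² − 1 is a power of 2 only for n = 3 in a small range.

```python
import math


def two_squares(n):
    """Return (a, b) with a >= b >= 1 and a*a + b*b == n, or None if none exists."""
    for b in range(1, math.isqrt(n // 2) + 1):
        a = math.isqrt(n - b * b)
        if a * a + b * b == n:
            return a, b
    return None


def is_power_of_two(m):
    """True when m is a positive power of 2 (including 1)."""
    return m > 0 and m & (m - 1) == 0


for n in (5, 17, 257, 65537):
    a, b = two_squares(n)
    # n - 1 is a power of 2, n = 1 (mod 4), a + b is odd, and n = (a+ib)(a-ib)
    print(n, (a, b), is_power_of_two(n - 1), n % 4, (a + b) % 2,
          complex(a, b) * complex(a, -b))

# n^2 - 1 is a power of 2 only when n = 3 (checked here for 2 <= n < 1000)
print([n for n in range(2, 1000) if is_power_of_two(n * n - 1)])
```

For n = 5 the search returns (2, 1), matching the factorization 5 = (2 + i)(2 − i) used by Abel.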

Carl Gustav Jacob Jacobi (1804-1851) studied at the University of Berlin, though 
he largely preferred to study on his own, especially Euler’s works. His interest 
in elliptic integrals was aroused by the quadratic transformations in Legendre’s 
Exercices de calcul intégral. In June 1827, Jacobi communicated a short note to 
the Astronomische Nachrichten giving two cubic transformations and two fifth-order 
transformations of elliptic integrals. 

In fact, in 1825 Legendre had already discovered this cubic transformation, though 
it was published in the second volume of his Traité des fonctions elliptiques. Heinrich 
Schumacher, editor of the Astronomische Nachrichten, noted that Jacobi did not refer to
Legendre’s book, though this was not surprising, since Jacobi had not seen Legendre’s 
work at the time. In any case, Jacobi also had the new result on the fifth-order 
transformation. Then in August 1827, Jacobi communicated to Schumacher a general 
odd-order transformation, allowing the division of an elliptic integral into an arbitrary 
odd number of parts. Unfortunately, Jacobi included no proof, and so Schumacher 
consulted his friend Gauss about the correctness of the results. Gauss replied that the 
results were correct but asked Schumacher not to communicate with him further on 
this topic. Gauss himself was planning to publish his twenty-five-year-old results on 
elliptic functions and wished to avoid priority disputes. 

Schumacher published Jacobi’s notes but urged him to supply the proofs as soon as 
possible. Legendre saw the paper and was eager to see the proofs. Jacobi told Legendre 
that he had only guessed the theorem for odd-order transformations; in November 
1827 he was able to derive a proof by means of the inversion of elliptic integrals. 
Meanwhile, in September, the first part of Abel’s “Recherches” had appeared. It is 
curious that Jacobi did not refer to Abel’s paper and avoided the question of whether 
he had borrowed any idea from Abel. In 1828, the second part of the “Recherches” 
was published, in which Abel added an appendix explaining how his own results could 
prove Jacobi’s theorem. Jacobi’s proof was published after Abel had written the second 
part of his paper. Christoffer Hansteen reported that when Abel saw Jacobi’s inversion 
of the elliptic integral without reference to him, he was visibly shocked. In fact, 
Abel wrote in a letter to Bernt Holmboe that he published his “Transformations des 
fonctions elliptiques” in order to supersede Jacobi; he called the paper his “knockout”
of Jacobi. In 1828, Gauss wrote Schumacher that Abel’s “Recherches” had relieved 
him of the duty of writing up a third of his investigations on elliptic functions. The 
other two thirds consisted of the arithmetic-geometric mean and the elliptic modular 
and theta functions. 

Abel’s early and tragic death in 1829 cut short the rivalry between Abel and 
Jacobi. In that same year, Jacobi published the results of two years’ labor on 
elliptic functions, in his Fundamenta Nova.⁸ This work presented an extensive
development of transformation theory and applied it to the derivation of series and
product representations of elliptic functions, their moduli, and periods. The problem
of converting products into series led Jacobi to the discovery of the triple product
identity, though Gauss had anticipated him. In fact, these series and products were
theta functions; thus, Jacobi had discovered that elliptic functions could be expressed
as quotients of theta functions.

8 See Jacobi (1969) vol. 1, pp. 49-239.

Jacobi earned his doctoral degree from Berlin with a thesis on partial fractions 
in 1825 and a year later he took a position at Königsberg. Because of his sharp wit
and tongue, Jacobi might have faced obstacles to advancement. However, he gained 
quick recognition from French mathematicians and Legendre in particular, who had 
presented Jacobi’s work to the French Academy in 1827. Finally, with the publication 
of the Fundamenta Nova, Jacobi became known as one of the most outstanding 
mathematicians in Europe. In a paper of 1834,⁹ Jacobi proved two important theorems
on functions of one variable: First, he showed that such a function could not have two 
fundamental periods whose ratio was real; secondly, he showed that such a function 
could have only two fundamental periods whose ratio was complex. He argued that the 
functions would otherwise have arbitrarily small periods, a condition he assumed to 
be absurd. Jacobi had not yet conceived of an analytic function, but when Weierstrass 
and Cauchy later confirmed his assumption, the proof relied on the fact that the zeros 
of analytic functions were isolated. 

Using a suggestion from Hermite that he use Fourier series, Joseph Liouville 
(1809-1882) in 1844 reproved Jacobi’s first theorem.¹⁰ This work initiated Liouville’s
definitive theory of elliptic functions. According to Weierstrass, this work was very 
important, though Liouville published little of it; Weierstrass also criticized Briot 
and Bouquet for publishing Liouville’s ideas without giving him sufficient credit. 
Liouville’s innovation was to define elliptic functions as doubly-periodic functions, 
rather than as inverses of integrals. He showed that doubly-periodic functions could 
not be bounded and, in fact, had to have at least two simple poles. Except for 
two short notes, he did not publish these results, but in 1847 he began a series of 
lectures on this topic. These lectures were first published in 1880¹¹ by the longtime
editor of Crelle’s Journal, Carl Borchardt (1817-1880), who in 1847 had attended 
the lectures. Reportedly, Borchardt also showed the notes to Jacobi and informed 
Liouville that Jacobi was extremely impressed. A typeset manuscript of these lectures, 
said by Weierstrass to have been taken from the notes of Borchardt, was found among 
Dirichlet’s papers after his death in 1859. Apparently, Liouville had intended to 
publish the notes in his own journal, but had perhaps asked his friend Dirichlet to 
review the proofs. Why did Liouville not see to it that the proofs were published? 
This sequence of events remains a mystery, even after Jesper Lützen’s comprehensive
and detailed book on Liouville, published in 1990. It is interesting to note that the
book by Liouville’s students, Briot and Bouquet, started with Liouville’s approach,
but proved the results by using the complex analytic methods of Cauchy and Laurent.
Many standard textbooks of today make use of these methods.

9 Jacobi (1969) vol. 2, pp. 23-50.
10 See Lützen (1990) pp. 535-540.
11 Liouville (1880).

At about the same time as Liouville, Gotthold Eisenstein (1823-1852) provided yet 
another important approach to elliptic functions. Eisenstein was dissatisfied with the 
inversion of the elliptic integral in Abel and Jacobi. He observed that, since integrals 
defined single-valued functions, the periodicity of their inverses must be problematic. 
Eisenstein was a number theorist of extraordinary vision. He viewed the theory of 
periodic functions as inseparable from number theory. In fact, in 1847 he published 
a 120-page treatise in Crelle’s Journal, developing a new basis for elliptic function 
theory, using double series and double products; this approach was well suited for 
number theoretic applications.¹² The Weierstrass elliptic function ℘(z) first appeared
in this work. Although this paper was soon republished in a collection of Eisenstein’s 
papers, with a foreword by no less a personage than Gauss, it unfortunately did not 
receive recognition in the nineteenth century. In the preface to his 1975 book Elliptic 
Functions According to Eisenstein and Kronecker, André Weil brought this paper to 
the attention of the mathematical community. He wrote,¹³


It is not merely out of an antiquarian interest that the attempt will be made here to resurrect them 
[Eisenstein’s ideas]. Not only do they provide the best introduction to the work of Hecke; but we 
hope to show that they can be applied quite profitably to some current problems, particularly if 
they are used in conjunction with Kronecker’s late work which is their natural continuation. 


Weil’s treatment of Eisenstein is thorough and insightful as well as easily available. 
Thus, the reader may profitably consult Weil for Eisenstein’s 1847 work.¹⁴

Eisenstein’s objection to the inversion of the elliptic integral was addressed by 
Cauchy in the 1840s and then by Riemann in the 1850s.¹⁵ Cauchy had been vigorously
developing the theory of complex integration since 1814; this work provided him with 
the tools necessary to address this problem. Riemann was familiar with Cauchy’s 
work, but he added his original idea of a Riemann surface to study Abelian, and in 
particular elliptic, functions. 


34.2 Abel: Elliptic Functions 


Abel’s great paper of 1827, “Recherches sur les fonctions elliptiques,” was published 
in two parts in volumes two and three of Crelle’s Journal. In this paper, Abel defined 
an elliptic function as the inverse of the elliptic integral¹⁶

\[ \alpha = \int_0^x \frac{dx}{\sqrt{(1 - c^2x^2)(1 + e^2x^2)}}, \qquad 0 \le x \le \frac{1}{c}. \tag{34.6} \]


12 Eisenstein (1975) vol. 1, pp. 357-478.
13 Weil (1976) p. 4.
14 Also see Roy (2017) chapter 4.
15 For Riemann’s work on elliptic functions, see his lectures: Riemann (1899).
16 Abel (2007) pp. 145-156.
He expressed x as a function of α and set x = φα. He noted that α was positive
and increasing as x moved from 0 to 1/c, and set

\[ \frac{\omega}{2} = \int_0^{1/c} \frac{dx}{\sqrt{(1 - c^2x^2)(1 + e^2x^2)}}. \tag{34.7} \]

Thus, φα was positive and increasing in 0 ≤ α ≤ ω/2 and

\[ \varphi(0) = 0, \qquad \varphi\left(\frac{\omega}{2}\right) = \frac{1}{c}. \]


Moreover, since α changed sign when x was changed to −x, he had φ(−α) =
−φ(α). Abel then formally changed x to ix without a rigorous justification, just as
Euler, Laplace, and Poisson had done earlier. Now in an 1814 paper published in
1827¹⁷ and in papers published as early as 1825,¹⁸ Cauchy discussed functions of
complex variables in a more systematic manner. Abel could have employed Cauchy’s
ideas to give a more rigorous foundation of his theory of elliptic functions. It is
possible that Abel was not aware of this aspect of Cauchy’s work. In any case, with
the above change of variables, Abel set

\[ xi = \varphi(\beta i), \quad \text{where} \quad \beta = \int_0^x \frac{dx}{\sqrt{(1 + c^2x^2)(1 - e^2x^2)}}, \tag{34.8} \]

and observed that β was real and positive for 0 ≤ x ≤ 1/e. He then set

\[ \frac{\varpi}{2} = \int_0^{1/e} \frac{dx}{\sqrt{(1 - e^2x^2)(1 + c^2x^2)}}, \tag{34.9} \]

so that −iφ(βi) was positive for 0 ≤ β ≤ ϖ/2. He also had

\[ \varphi\left(\frac{\varpi i}{2}\right) = \frac{i}{e}. \tag{34.10} \]

Abel then defined two auxiliary functions

\[ f\alpha = \sqrt{1 - c^2\varphi^2\alpha}, \tag{34.11} \]
\[ F\alpha = \sqrt{1 + e^2\varphi^2\alpha}, \tag{34.12} \]

and noted that when c and e were interchanged, f(αi) and F(αi) were transformed
into F(α) and f(α), respectively.


17 Cauchy (1882-1974) Ser. 1, vol. 1, pp. 329-506.
18 ibid. Ser. 2, vol. 15, pp. 41-89.
At this point, Abel observed that φ(α) was already defined for −ω/2 ≤ α ≤ ω/2 and
φ(βi) for −ϖ/2 ≤ β ≤ ϖ/2; next he wished to define φ for all complex numbers. In order
to achieve this, Abel employed the addition formula for φ:

\[ \varphi(\alpha + \beta) = \frac{\varphi\alpha \cdot f\beta \cdot F\beta + \varphi\beta \cdot f\alpha \cdot F\alpha}{1 + e^2c^2\,\varphi^2\alpha \cdot \varphi^2\beta}. \tag{34.13} \]

He also stated the addition formulas for the auxiliary functions fα and Fα,
remarking that these formulas could be deduced from the results in Legendre’s
Exercices but he wanted to give an alternative derivation. He first deduced the easily
proved formulas for the derivatives:

\[ \varphi'\alpha = f\alpha \cdot F\alpha, \qquad f'\alpha = -c^2\,\varphi\alpha \cdot F\alpha, \qquad F'\alpha = e^2\,\varphi\alpha \cdot f\alpha. \]

Abel then let r designate the right-hand side of (34.13) and showed that

\[ \frac{dr}{d\alpha} = \frac{(1 - e^2c^2\varphi^2\alpha\,\varphi^2\beta)\left[(e^2 - c^2)\,\varphi\alpha\,\varphi\beta + f\alpha\, f\beta\, F\alpha\, F\beta\right] - 2e^2c^2\,\varphi\alpha\,\varphi\beta\,(\varphi^2\alpha + \varphi^2\beta)}{(1 + e^2c^2\varphi^2\alpha\,\varphi^2\beta)^2}. \]

By symmetry in α and β, Abel concluded that

\[ \frac{dr}{d\alpha} = \frac{dr}{d\beta}. \]

He observed that this partial differential equation implied that r = ψ(α + β) for
some function ψ. Moreover, since φ(0) = 0, f(0) = 1, F(0) = 1, he set β = 0 in the
expression for r on the right-hand side of (34.13) and found that r = φα. But r = ψα
when β = 0. So he had φα = ψα or φ = ψ. This proved the addition formula.
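
As an informal numerical check of the addition formula (34.13), which is no part of Abel’s argument, one may integrate the differential equation satisfied by φ and compare the two sides. The sketch below is in Python; the names abel_phi and addition_rhs, the Runge-Kutta step count, and the sample values of c and e are illustrative assumptions. It relies on the derivative formulas, which give φ″ = φ(e² − c² − 2e²c²φ²) and allow the product fβ·Fβ to be replaced by φ′β.

```python
def abel_phi(alpha, c=1.0, e=1.0, steps=2000):
    """Return (phi(alpha), phi'(alpha)) for Abel's function with parameters c, e.

    Classical RK4 applied to phi'' = phi * (e^2 - c^2 - 2 e^2 c^2 phi^2),
    with phi(0) = 0 and phi'(0) = f(0) F(0) = 1.
    """
    h = alpha / steps
    y, v = 0.0, 1.0

    def acc(value):
        return value * (e * e - c * c - 2.0 * e * e * c * c * value * value)

    for _ in range(steps):
        k1y, k1v = v, acc(y)
        k2y, k2v = v + 0.5 * h * k1v, acc(y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, acc(y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, acc(y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return y, v


def addition_rhs(a, b, c=1.0, e=1.0):
    """Right-hand side of (34.13), with f*F replaced by the derivative phi'."""
    pa, da = abel_phi(a, c, e)
    pb, db = abel_phi(b, c, e)
    return (pa * db + pb * da) / (1.0 + e * e * c * c * pa * pa * pb * pb)
```

With c = e = 1 the same routine returns a value close to 1 at α = ω/2 ≈ 1.3110287771, in agreement with φ(ω/2) = 1/c.
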
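Abel deduced the periodicity of φ from the addition formula. He first set β = ±ω/2
and β = ±ϖi/2 in (34.13). Observing that f(±ω/2) = 0, and F(±ϖi/2) = 0, he then
obtained the formulas

\[ \varphi\left(\alpha \pm \frac{\omega}{2}\right) = \pm\varphi\left(\frac{\omega}{2}\right)\frac{f\alpha}{F\alpha} = \pm\frac{1}{c}\,\frac{f\alpha}{F\alpha}, \]

\[ \varphi\left(\alpha \pm \frac{\varpi i}{2}\right) = \pm\varphi\left(\frac{\varpi i}{2}\right)\frac{F\alpha}{f\alpha} = \pm\frac{i}{e}\,\frac{F\alpha}{f\alpha}. \]

These results implied that

\[ \varphi\left(\frac{\omega}{2} + \alpha\right) = \varphi\left(\frac{\omega}{2} - \alpha\right), \tag{34.14} \]
\[ \varphi\left(\frac{\varpi i}{2} + \alpha\right) = \varphi\left(\frac{\varpi i}{2} - \alpha\right), \tag{34.15} \]
\[ \varphi\left(\alpha \pm \frac{\omega}{2}\right)\varphi\left(\alpha + \frac{\varpi i}{2}\right) = \pm\frac{i}{ce}. \tag{34.16} \]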
Replacing α by α + ω/2 in (34.14), and α by α + ϖi/2 in (34.15), Abel found

\[ \varphi(\alpha + \omega) = \varphi(-\alpha) = -\varphi\alpha, \tag{34.17} \]
\[ \varphi(\alpha + \varpi i) = -\varphi\alpha. \tag{34.18} \]

By means of these formulas, he defined φα and φ(αi) for all real α and then by
the addition formula (34.13) he obtained φ(α + βi) for any complex value α + βi.
Moreover, from (34.17) and (34.18), it followed that φ was doubly-periodic with
periods 2ω and 2ϖi:

\[ \varphi(2\omega + \alpha) = -\varphi(\omega + \alpha) = \varphi\alpha, \]
\[ \varphi(2\varpi i + \alpha) = -\varphi(\varpi i + \alpha) = \varphi\alpha. \]

Abel also determined the zeros and poles of φ. For example, from (34.16) he
obtained

\[ \varphi\left(\frac{\omega}{2} + \frac{\varpi i}{2}\right) = \infty. \]

Then by (34.17) and (34.18),

\[ \varphi\left(\left(m + \tfrac{1}{2}\right)\omega + \left(n + \tfrac{1}{2}\right)\varpi i\right) = \infty \]

when m and n were integers. Then with a little more work, Abel showed that
(m + ½)ω + (n + ½)ϖi were all the poles of φ. Similarly, he showed that mω + nϖi
were all the zeros of φ.


34.3 Abel: Infinite Products 


Recall that one way of deriving the infinite product for sin x is to express sin(2n + 1)x
by means of the addition theorem as a polynomial of degree 2n + 1 in sin x, factorize
this polynomial, and then take the limit as n tends to infinity. Abel applied a similar
procedure to obtain the product for φα. Abel deduced from the addition formula that
for a positive integer n,

\[ \varphi(n+1)\beta = -\varphi(n-1)\beta + \frac{2\,\varphi(n\beta)\, f\beta \cdot F\beta}{1 + c^2e^2\,\varphi^2(n\beta)\,\varphi^2\beta}. \]

After some further calculation, he proved by induction that

\[ \varphi(2n\beta) = \varphi\beta \cdot f\beta \cdot F\beta \cdot T, \qquad \varphi(2n+1)\beta = \varphi\beta \cdot T_1, \]

where T and T_1 were rational functions of (φβ)². He then wrote

\[ \varphi(2n+1)\beta = \frac{P_{2n+1}}{Q_{2n+1}}, \tag{34.19} \]

where P_{2n+1} and Q_{2n+1} were polynomials in x = φβ of degree (2n + 1)² and 4n(n + 1),
respectively. He noted that, for a given value of φ(2n + 1)β, the roots of (34.19)
regarded as an equation in x were clearly given by

\[ x = (-1)^{m+\mu}\,\varphi\left(\beta + \frac{m}{2n+1}\,\omega + \frac{\mu}{2n+1}\,\varpi i\right), \tag{34.20} \]

for −n ≤ m, μ ≤ n; by setting β = α/(2n + 1), the roots were

\[ x = (-1)^{m+\mu}\,\varphi\left(\frac{\alpha}{2n+1} + \frac{m\omega + \mu\varpi i}{2n+1}\right). \]


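Abel next expressed φ(2n + 1)β as a sum and as a product of terms of the form
(34.20). His method was similar to Euler’s derivation in the Introductio in Analysin
Infinitorum, where Euler expressed sin(2n + 1)x as a product of terms of the form
sin(x + mπ/(2n + 1)). See Section 15.4 in this connection. Abel wrote

\[ P_{2n+1} = Ax^{(2n+1)^2} + \cdots + Bx, \]
\[ Q_{2n+1} = Cx^{(2n+1)^2 - 1} + \cdots + D, \]

so that by (34.19), he had

\[ Ax^{(2n+1)^2} + \cdots + Bx = \varphi(2n+1)\beta \cdot \left(Cx^{(2n+1)^2 - 1} + \cdots + D\right). \]

He observed that the highest-power term had coefficient A, the second highest term
had coefficient −φ(2n + 1)β·C, and the last term was −φ(2n + 1)β·D. Then, since
the roots of the equation were given by (34.20), the sum of the roots could be obtained
from the coefficient of the second highest term and the product of the roots from the
last term. Thus, he had the equations

\[ \varphi(2n+1)\beta = \frac{A}{C}\sum_{m=-n}^{n}\sum_{\mu=-n}^{n} (-1)^{m+\mu}\,\varphi\left(\beta + \frac{m\omega + \mu\varpi i}{2n+1}\right) \tag{34.21} \]
\[ \phantom{\varphi(2n+1)\beta} = \frac{A}{D}\prod_{m=-n}^{n}\prod_{\mu=-n}^{n} \varphi\left(\beta + \frac{m\omega + \mu\varpi i}{2n+1}\right). \tag{34.22} \]

Abel set β = ω/2 + ϖi/2 + α, and let α → 0 to determine

\[ \frac{A}{C} = \frac{1}{2n+1}. \tag{34.23} \]

He then let β → 0, to obtain

\[ 2n + 1 = \frac{A}{D}\prod_{m=1}^{n}\varphi^2\left(\frac{m\omega}{2n+1}\right)\prod_{\mu=1}^{n}\varphi^2\left(\frac{\mu\varpi i}{2n+1}\right) \times \prod_{m=1}^{n}\prod_{\mu=1}^{n}\varphi^2\left(\frac{m\omega + \mu\varpi i}{2n+1}\right)\varphi^2\left(\frac{m\omega - \mu\varpi i}{2n+1}\right). \tag{34.24} \]
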
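This gave him an expression for A/D and he substituted it back in (34.22). To simplify
the resulting product, he applied a consequence of the addition formula:

\[ \frac{\varphi(\beta + \alpha)\,\varphi(\beta - \alpha)}{\varphi^2\alpha} = -\frac{1 - \dfrac{\varphi^2\beta}{\varphi^2\alpha}}{1 - \dfrac{\varphi^2\beta}{\varphi^2\left(\alpha + \frac{\omega}{2} + \frac{\varpi i}{2}\right)}}. \]

Thus, Abel obtained

\[ \varphi(2n+1)\beta = (2n+1)\,\varphi\beta \prod_{m=1}^{n}\frac{1 - \dfrac{\varphi^2\beta}{\varphi^2\left(\frac{m\omega}{2n+1}\right)}}{1 - \dfrac{\varphi^2\beta}{R_{m,0}}} \prod_{\mu=1}^{n}\frac{1 - \dfrac{\varphi^2\beta}{\varphi^2\left(\frac{\mu\varpi i}{2n+1}\right)}}{1 - \dfrac{\varphi^2\beta}{R_{0,\mu}}} \times \prod_{m=1}^{n}\prod_{\mu=1}^{n}\frac{\left(1 - \dfrac{\varphi^2\beta}{\varphi^2\left(\frac{m\omega + \mu\varpi i}{2n+1}\right)}\right)\left(1 - \dfrac{\varphi^2\beta}{\varphi^2\left(\frac{m\omega - \mu\varpi i}{2n+1}\right)}\right)}{\left(1 - \dfrac{\varphi^2\beta}{R_{m,\mu}}\right)\left(1 - \dfrac{\varphi^2\beta}{R_{m,-\mu}}\right)}, \tag{34.25} \]

where

\[ R_{m,\mu} = \varphi^2\left(\frac{\omega}{2} + \frac{\varpi i}{2} + \frac{m\omega + \mu\varpi i}{2n+1}\right). \]

He then set β = α/(2n + 1), let n → ∞, and used the formula

\[ \lim_{n\to\infty} \frac{\varphi^2\left(\frac{\alpha}{2n+1}\right)}{\varphi^2\left(\frac{x}{2n+1}\right)} = \frac{\alpha^2}{x^2} \]

to obtain an infinite product for φα. Abel carried out several pages of calculations to
show that the limiting procedure was valid and that the product converged to φα. It is
not clear that Abel’s justification was complete. Anyhow, Abel obtained the formula¹⁹

\[ \varphi\alpha = \alpha \prod_{m=1}^{\infty}\left(1 - \frac{\alpha^2}{(m\omega)^2}\right) \prod_{\mu=1}^{\infty}\left(1 + \frac{\alpha^2}{(\mu\varpi)^2}\right) \times \prod_{m=1}^{\infty}\left\{ \prod_{\mu=1}^{\infty} \frac{1 - \dfrac{\alpha^2}{(m\omega + \mu\varpi i)^2}}{1 - \dfrac{\alpha^2}{\left((m-\frac12)\omega + (\mu-\frac12)\varpi i\right)^2}} \prod_{\mu=1}^{\infty} \frac{1 - \dfrac{\alpha^2}{(m\omega - \mu\varpi i)^2}}{1 - \dfrac{\alpha^2}{\left((m-\frac12)\omega - (\mu-\frac12)\varpi i\right)^2}} \right\}. \tag{34.26} \]


19 Abel (2007) p. 236.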
Recall that in 1797 Gauss obtained a similar formula for the particular case of the
lemniscatic function. Abel then expressed (34.26) in terms of sines, just as Gauss had
done in 1798. This and other similarities in their work led Gauss to remark that Abel
followed the same steps as he did in 1797. Abel next rewrote the double product in
(34.26):


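\[ \prod_{\mu=1}^{\infty}\frac{\left(1 - \frac{\alpha^2}{(m\omega + \mu\varpi i)^2}\right)\left(1 - \frac{\alpha^2}{(m\omega - \mu\varpi i)^2}\right)}{\left(1 - \frac{\alpha^2}{((m-\frac12)\omega + (\mu-\frac12)\varpi i)^2}\right)\left(1 - \frac{\alpha^2}{((m-\frac12)\omega - (\mu-\frac12)\varpi i)^2}\right)} = \prod_{\mu=1}^{\infty}\frac{\left(1 + \frac{(\alpha + m\omega)^2}{\mu^2\varpi^2}\right)\left(1 + \frac{(\alpha - m\omega)^2}{\mu^2\varpi^2}\right)}{\left(1 + \frac{m^2\omega^2}{\mu^2\varpi^2}\right)^2} \cdot \frac{\left(1 + \frac{(m-\frac12)^2\omega^2}{(\mu-\frac12)^2\varpi^2}\right)^2}{\left(1 + \frac{(\alpha + (m-\frac12)\omega)^2}{(\mu-\frac12)^2\varpi^2}\right)\left(1 + \frac{(\alpha - (m-\frac12)\omega)^2}{(\mu-\frac12)^2\varpi^2}\right)}. \]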


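Then by means of the product for sin x given by

\[ \sin x = x\prod_{n=1}^{\infty}\left(1 - \frac{x^2}{n^2\pi^2}\right), \]

and using the addition formula for sine given by

\[ \sin(a - b)\,\sin(a + b) = \sin^2 a - \sin^2 b, \]

he obtained

\[ \varphi\alpha = \frac{\varpi}{\pi}\,\frac{s}{i}\prod_{m=1}^{\infty}\frac{1 - \dfrac{s^2}{A_m^2}}{1 - \dfrac{s^2}{B_m^2}}, \]

where

\[ s = \sin\left(\frac{\alpha\pi i}{\varpi}\right), \qquad A_m = \sin\left(\frac{m\omega\pi i}{\varpi}\right), \qquad B_m = \cos\left(\left(m - \tfrac12\right)\frac{\omega\pi i}{\varpi}\right). \]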
73) o 2) @ 


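Finally, by the use of φ(αi) = iφα, where c and e (and hence ω and ϖ) are interchanged
on the right-hand side, he obtained his product formula:

\[ \varphi\alpha = \frac{\omega}{\pi}\sin\left(\frac{\alpha\pi}{\omega}\right)\prod_{m=1}^{\infty}\frac{1 + \dfrac{4\sin^2\left(\frac{\alpha\pi}{\omega}\right)}{\left(e^{m\varpi\pi/\omega} - e^{-m\varpi\pi/\omega}\right)^2}}{1 - \dfrac{4\sin^2\left(\frac{\alpha\pi}{\omega}\right)}{\left(e^{(2m-1)\varpi\pi/(2\omega)} + e^{-(2m-1)\varpi\pi/(2\omega)}\right)^2}}. \tag{34.27} \]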

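Abel also used the series (34.21) to obtain various other formulas, including, in the
lemniscatic case,²⁰

\[ \varphi\left(\frac{\alpha\omega}{2}\right) = \frac{4\pi}{\omega}\left(\frac{e^{\pi/2}}{1 + e^{\pi}}\sin\frac{\alpha\pi}{2} - \frac{e^{3\pi/2}}{1 + e^{3\pi}}\sin\frac{3\alpha\pi}{2} + \frac{e^{5\pi/2}}{1 + e^{5\pi}}\sin\frac{5\alpha\pi}{2} - \cdots\right). \]


20 ibid. p. 244.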
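
The product (34.27) and the sine series can be compared numerically in the lemniscatic case c = e = 1, where ϖ = ω. The following Python sketch is only an illustration and not Abel’s computation: the function names are hypothetical, the truncation at 20 terms is an arbitrary choice, and the value of ω is taken from Gauss’s evaluation ω = π/agm(1, √2), a fact assumed here rather than derived in this section. In the exponentials, e denotes the base of the natural logarithm.

```python
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a


# omega = 2 * integral_0^1 dx / sqrt(1 - x^4), via Gauss's AGM evaluation (assumed)
OMEGA = math.pi / agm(1.0, math.sqrt(2.0))


def phi_product(alpha, terms=20):
    """Truncated product (34.27) with varpi = omega."""
    sine = math.sin(alpha * math.pi / OMEGA)
    value = (OMEGA / math.pi) * sine
    for m in range(1, terms + 1):
        full = m * math.pi
        half = (2 * m - 1) * math.pi / 2.0
        num = 1.0 + 4.0 * sine ** 2 / (math.exp(full) - math.exp(-full)) ** 2
        den = 1.0 - 4.0 * sine ** 2 / (math.exp(half) + math.exp(-half)) ** 2
        value *= num / den
    return value


def phi_series(alpha, terms=20):
    """Truncated sine series for phi(t * omega / 2), evaluated at t = 2 * alpha / omega."""
    t = 2.0 * alpha / OMEGA
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1
        coeff = math.exp(n * math.pi / 2.0) / (1.0 + math.exp(n * math.pi))
        total += (-1) ** k * coeff * math.sin(n * t * math.pi / 2.0)
    return 4.0 * math.pi / OMEGA * total
```

Both functions should agree closely for real α and should return a value near 1 at α = ω/2, where φ(ω/2) = 1/c = 1.
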
34.4 Abel: Division of Elliptic Functions and Algebraic Equations


In his 1827 paper, Abel considered Gauss’s algebraic theory on the division of periodic
functions and extended it to the division of doubly-periodic functions. To understand
Abel’s motivation, recall that from a study of Viète, Newton determined that for an
odd number n, sin nx could be expressed as a polynomial of degree n in sin x. Note
that a similar result holds for cos nx. Gauss proved that these polynomials could
be solved algebraically; Euler and Vandermonde had earlier done this for values
of n up to eleven. Abel determined from the addition theorem that for a positive
integer n, φ(2n + 1)α was a rational function of φα such that the numerator took
the form xR(x²), where R was a polynomial of degree n(2n + 2) and x = φα. His
problem was to find out whether R could be solved algebraically, that is, by radicals.
He discovered that he could employ Lagrange resolvents, an idea due to Waring,
Vandermonde, and Lagrange, as Gauss had also done. However, Abel’s problem was
more complicated than Gauss’s and took him deeper into the theory of equations.

As a mathematical aside, we briefly discuss Abel’s related contributions to the 
theory of equations. His work in elliptic function theory gave him glimpses into 
the nature of algebraically solvable equations. In particular, he sought to determine 
solvability in terms of the structure of the roots of the equation. In an 1826 letter 
to Crelle, Abel stated a result on the form of the roots of a solvable quintic. He 
later generalized this result to irreducible equations of prime degree, published 
posthumously in the first edition of his collected papers of 1837. This paper contained 
the remarkable theorem that an irreducible equation of prime degree was solvable by 
radicals if and only if all its roots were rational functions of any two of the roots. Galois 
rediscovered this theorem a few years later, but his work arose out of a study of those 
permutations of the roots preserving algebraic relations among the roots. Because the 
group theory of algebraic equations, developed by Galois, gained recognition before 
Abel’s theory, based on structure of roots, Abel’s theorems have now become recast 
and known in terms of groups. It might be fruitful to make a parallel study of the two 
approaches. 

Recall that Abel proved that φ(2n + 1)β was a rational function of x = φβ whose
numerator took the form xR(x²) where R was a polynomial of degree n(2n + 2).
Abel then proved the important theorem that the solutions of R = 0 depended on the
solutions of a certain equation of degree 2n + 2 with coefficients that were rational
functions of c and e. He proceeded to demonstrate that if the latter equation could be
solved by radicals, then so could R = 0. He went on to observe that, in general, this
equation was not solvable by radicals but could be solved in particular cases, such
as for e = c, e = √3 c, e = (2 + √3)c, etc. The case e = c corresponded to the
lemniscatic function and had already been discussed in Gauss’s unpublished work, at
least in special cases.

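Abel’s proof of this theorem was lengthy. We present a brief summary, using his
notation. First note that by (34.25) and the fact that the zeros of φ occur at mω + iμϖ,
it follows that the solutions of R = 0 must be given by

\[ r = \varphi^2\left(\frac{m\omega + i\mu\varpi}{2n+1}\right). \]

By periodicity of φ, the number of different values of r can be reduced to the
n(2n + 2) values given by

\[ r_\nu = \varphi^2\left(\frac{\nu\omega}{2n+1}\right), \qquad r_{\nu,m} = \varphi^2\left(\nu\,\frac{m\omega + i\varpi}{2n+1}\right), \tag{34.28} \]

where 1 ≤ ν ≤ n, 0 ≤ m ≤ 2n. Now let ω′ denote any quantity of the form mω + iμϖ
and define ψ by the equation

\[ \psi\left(\varphi^2\left(\frac{\omega'}{2n+1}\right)\right) = \theta\left(\varphi^2\left(\frac{\omega'}{2n+1}\right), \varphi^2\left(\frac{2\omega'}{2n+1}\right), \ldots, \varphi^2\left(\frac{n\omega'}{2n+1}\right)\right), \tag{34.29} \]

where θ is a rational symmetric function of the n quantities. It is clear from the
definition of ψ that

\[ \psi\left(\varphi^2\left(\frac{\nu\omega'}{2n+1}\right)\right) = \psi\left(\varphi^2\left(\frac{\omega'}{2n+1}\right)\right), \qquad 1 \le \nu \le n. \tag{34.30} \]

In particular,

\[ \psi r_\nu = \psi r_1, \qquad \psi r_{\nu,m} = \psi r_{1,m}, \qquad 1 \le \nu \le n. \tag{34.31} \]

The aforementioned equation of degree 2n + 2 can be given by

\[ (p - \psi r_1)(p - \psi r_{1,0})(p - \psi r_{1,1}) \cdots (p - \psi r_{1,2n}) = q_0 + q_1 p + q_2 p^2 + \cdots + q_{2n+1}p^{2n+1} + p^{2n+2}. \tag{34.32} \]

It is easy to see that q_0, q_1, …, q_{2n+1} are rational functions of c and e. Note that each
sum of the kth powers of the roots of (34.32) is a symmetric function of the n(2n + 2)
roots r_ν and r_{ν,m} of R = 0, where r_ν and r_{ν,m} are given by (34.28). To see this, observe
that

\[ (\psi r_1)^k = \frac{1}{n}\left[(\psi r_1)^k + (\psi r_2)^k + \cdots + (\psi r_n)^k\right], \]
\[ (\psi r_{1,m})^k = \frac{1}{n}\left[(\psi r_{1,m})^k + (\psi r_{2,m})^k + \cdots + (\psi r_{n,m})^k\right], \qquad 0 \le m \le 2n, \]

and

\[ (\psi r_1)^k + (\psi r_{1,0})^k + (\psi r_{1,1})^k + \cdots + (\psi r_{1,2n})^k = \frac{1}{n}\left[(\psi r_1)^k + (\psi r_2)^k + \cdots + (\psi r_n)^k\right] + \frac{1}{n}\left[(\psi r_{1,0})^k + (\psi r_{2,0})^k + \cdots + (\psi r_{n,0})^k\right] + \cdots + \frac{1}{n}\left[(\psi r_{1,2n})^k + (\psi r_{2,2n})^k + \cdots + (\psi r_{n,2n})^k\right]. \]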
Since the coefficients of the polynomial R are rational functions of c and e, we may
now conclude that each sum of the kth powers of the roots of the polynomial (34.32) is
a rational function of c and e. Since the power sum symmetric functions form a basis
for the symmetric functions, it follows that q_0, q_1, …, q_{2n+1} are rational functions of
c and e.

Next, we show that if p = ψr_1 and q = θr_1 are rational symmetric functions of
r_1, r_2, …, r_n, then q can be determined in terms of p. Note that a similar result holds
for ψr_{1,m} and θr_{1,m}. For k = 0, 1, …, 2n + 1, set

\[ s_k = (\psi r_1)^k\,\theta r_1 + (\psi r_{1,0})^k\,\theta r_{1,0} + \cdots + (\psi r_{1,2n})^k\,\theta r_{1,2n}. \tag{34.33} \]

We prove that s_k can be expressed as a rational function of c and e. Note that

\[ (\psi r_1)^k\,\theta r_1 = (\psi r_\nu)^k\,\theta r_\nu = \frac{1}{n}\left[(\psi r_1)^k\,\theta r_1 + (\psi r_2)^k\,\theta r_2 + \cdots + (\psi r_n)^k\,\theta r_n\right], \]
\[ (\psi r_{1,m})^k\,\theta r_{1,m} = (\psi r_{\nu,m})^k\,\theta r_{\nu,m} = \frac{1}{n}\left[(\psi r_{1,m})^k\,\theta r_{1,m} + \cdots + (\psi r_{n,m})^k\,\theta r_{n,m}\right]. \]

When these values are substituted in (34.33), we observe that s_k is a symmetric
rational function of the roots of R = 0; therefore, s_k, k = 0, 1, …, 2n + 1, are
rational functions of c and e. We can apply Cramer’s rule to solve these equations
for θr_1, θr_{1,0}, …, θr_{1,2n} in terms of rational functions of ψr_1, ψr_{1,0}, …, ψr_{1,2n}. This
result in turn implies that the coefficients of the equation

\[ (r - r_1)(r - r_2)\cdots(r - r_n) = r^n + p_{n-1}r^{n-1} + p_{n-2}r^{n-2} + \cdots + p_1 r + p_0 \tag{34.34} \]

can be determined by the equation (34.32). There are 2n + 1 additional equations of
degree n with roots r_{1,ν}, …, r_{n,ν} for 0 ≤ ν ≤ 2n; the coefficients of these equations
are also determined by (34.32).
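
The use of Cramer’s rule in this step amounts to solving a Vandermonde-type linear system: the known sums s_k weight the unknown values θ by kth powers of the distinct values ψ. The Python sketch below illustrates this with exact rational arithmetic; the function names and the sample numbers are hypothetical stand-ins, not quantities from Abel’s paper.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return sign * result


def recover_theta(psi_values, s):
    """Solve sum_j psi_j**k * theta_j = s[k], k = 0..N-1, by Cramer's rule.

    The psi_j must be pairwise distinct so that the Vandermonde
    determinant in the denominator does not vanish.
    """
    n = len(psi_values)
    vander = [[Fraction(p) ** k for p in psi_values] for k in range(n)]
    d = det(vander)
    solution = []
    for j in range(n):
        replaced = [row[:j] + [Fraction(s[k])] + row[j + 1:]
                    for k, row in enumerate(vander)]
        solution.append(det(replaced) / d)
    return solution
```

Each recovered value is a rational expression in the ψ values and the sums s_k, which mirrors the rationality claim made in the text.
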

In this way, Abel reduced the problem of solving the equation R = 0 of degree
n(2n + 2) to that of solving 2n + 2 equations of the form (34.34). We demonstrate
by means of the Lagrange resolvent (Gauss’s method for solving the cyclotomic
equation)²¹ that the solutions of these equations can be expressed in terms of the
solutions to (34.32). Let

\[ \varphi^2\left(\frac{\omega'}{2n+1}\right), \varphi^2\left(\frac{2\omega'}{2n+1}\right), \ldots, \varphi^2\left(\frac{n\omega'}{2n+1}\right) \]

denote the solutions of (34.34), where ω′ stands for ω or mω + iϖ. By a theorem of
Gauss, there exists a number α generating the numbers 1, 2, …, 2n (modulo 2n + 1).
Then by the periodicity of φ, the set

\[ \varphi^2(\epsilon), \varphi^2(\alpha\epsilon), \varphi^2(\alpha^2\epsilon), \ldots, \varphi^2(\alpha^{n-1}\epsilon), \]

where ε = ω′/(2n + 1), represents all the solutions of (34.34). We omit Abel’s
straightforward proof of this result.
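
The arithmetic behind this reordering can be illustrated as follows, under the assumption that 2n + 1 is an odd prime p. Because φ² is even and unchanged when its argument is shifted by ω′, the multiples kε and (p − k)ε give the same root, so it suffices that the first n powers of a primitive root cover the residues 1, …, n up to sign. The Python sketch below checks this by brute force; the function names are hypothetical.

```python
def primitive_root(p):
    """Smallest generator of the nonzero residues modulo an odd prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def half_system(p):
    """Return (g, residues), where residues lists g**k mod p for k < (p - 1) / 2,
    each reduced to its representative in 1..(p - 1) / 2 up to sign, then sorted."""
    n = (p - 1) // 2
    g = primitive_root(p)
    reduced = sorted(min(r, p - r) for r in (pow(g, k, p) for k in range(n)))
    return g, reduced
```

For p = 7 the generator found is 3, whose powers 1, 3, 2 already exhaust the classes 1, 2, 3.
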


21 See Neumann (2007b). 
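Now let θ denote any imaginary root of θ^n − 1 = 0, and define the Lagrange
resolvent

\[ \psi(\epsilon) = \varphi^2(\epsilon) + \varphi^2(\alpha\epsilon)\,\theta + \varphi^2(\alpha^2\epsilon)\,\theta^2 + \cdots + \varphi^2(\alpha^{n-1}\epsilon)\,\theta^{n-1}. \tag{34.35} \]

It is clear that ψ(ε) is a rational function of φ²(ε), expressible as ψ(ε) = χ(φ²(ε)).
By a simple calculation involving roots of unity, we can show that

\[ \psi(\alpha^m\epsilon) = \theta^{-m}\psi(\epsilon) \quad \text{or} \quad \psi(\epsilon) = \theta^{m}\chi(\varphi^2(\alpha^m\epsilon)), \]

implying that (ψε)^n = [χ(φ²(α^m ε))]^n. Taking m = 0, 1, …, n − 1 and adding we
arrive at

\[ n(\psi\epsilon)^n = [\chi(\varphi^2(\epsilon))]^n + [\chi(\varphi^2(\alpha\epsilon))]^n + \cdots + [\chi(\varphi^2(\alpha^{n-1}\epsilon))]^n. \tag{34.36} \]

The expression on the right-hand side of (34.36) is a rational symmetric function of

\[ \varphi^2(\epsilon), \varphi^2(\alpha\epsilon), \ldots, \varphi^2(\alpha^{n-1}\epsilon). \]

That is, it is a rational symmetric function of the roots of (34.34). Therefore,
(ψε)^n = v is a rational function of p_0, p_1, …, p_{n−1} and

\[ \sqrt[n]{v} = \varphi^2\epsilon + \theta\,\varphi^2(\alpha\epsilon) + \theta^2\varphi^2(\alpha^2\epsilon) + \cdots + \theta^{n-1}\varphi^2(\alpha^{n-1}\epsilon). \tag{34.37} \]

Note also that v is a rational function of the roots of (34.32); so if (34.32) can be
solved by radicals, then v can be expressed in terms of radicals. By changing θ to
θ², θ³, …, θ^{n−1} and denoting the corresponding values of v by v_2, v_3, …, v_{n−1}, we
have, with v_1 = v,

\[ \sqrt[n]{v_k} = \varphi^2(\epsilon) + \theta^{k}\varphi^2(\alpha\epsilon) + \cdots + \theta^{(n-1)k}\varphi^2(\alpha^{n-1}\epsilon), \qquad k = 1, 2, \ldots, n-1. \tag{34.38} \]

When these n − 1 equations are combined with the equation

\[ -p_{n-1} = \varphi^2(\epsilon) + \varphi^2(\alpha\epsilon) + \cdots + \varphi^2(\alpha^{n-1}\epsilon), \]

we can easily solve these n linear equations to get

\[ \varphi^2(\alpha^m\epsilon) = \frac{1}{n}\left(-p_{n-1} + \theta^{-m}\sqrt[n]{v_1} + \theta^{-2m}\sqrt[n]{v_2} + \cdots + \theta^{-(n-1)m}\sqrt[n]{v_{n-1}}\right), \tag{34.39} \]

for m = 0, 1, …, n − 1.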
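
The passage from the resolvents back to the individual roots is a discrete Fourier inversion. The Python sketch below illustrates the mechanism with arbitrary numbers in place of the values φ²(α^m ε); it assumes that θ is a primitive nth root of unity, which is what the inversion requires, and the function names are hypothetical.

```python
import cmath


def resolvents(roots):
    """t_k = sum_j theta**(k*j) * x_j for k = 0..n-1, theta a primitive nth root of unity.

    t_0 is the sum of the roots (the quantity -p_{n-1} in the text) and
    t_k plays the role of the nth root of v_k.
    """
    n = len(roots)
    theta = cmath.exp(2j * cmath.pi / n)
    return [sum(theta ** (k * j) * x for j, x in enumerate(roots)) for k in range(n)]


def invert(t):
    """Recover x_m = (1/n) * sum_k theta**(-k*m) * t_k, the analogue of (34.39)."""
    n = len(t)
    theta = cmath.exp(2j * cmath.pi / n)
    return [sum(theta ** (-k * m) * t[k] for k in range(n)) / n for m in range(n)]
```

A cyclic shift of the input multiplies t_k by a power of θ, so the nth powers t_k^n are unchanged; this is the invariance used to show that v is a rational function of the coefficients.
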
It can also be shown that

\[ s_k = \frac{\sqrt[n]{v_k}}{\left(\sqrt[n]{v}\right)^k} \]

is a rational function of p_0, p_1, …, p_{n−1}. For this purpose, it is sufficient to check that
s_k is unchanged by ε → αε. This gives us Abel’s final formula for φ²(α^m ε):

\[ \varphi^2(\alpha^m\epsilon) = \frac{1}{n}\left(-p_{n-1} + \theta^{-m}v^{1/n} + s_2\theta^{-2m}v^{2/n} + \cdots + s_{n-1}\theta^{-(n-1)m}v^{(n-1)/n}\right), \tag{34.40} \]

for m = 0, 1, …, n − 1. This implies that if v can be expressed in terms of radicals,
then R = 0 can be solved by radicals.


34.5 Abel: Division of the Lemniscate 


Recall Abel’s remark that in the case e/c = 1, the division points φ²(ω′/(2n + 1)) could
be obtained by solving an algebraic equation by radicals. When e = c = 1, Abel’s
integral (34.6) is reduced to

\[ \alpha = \int_0^x \frac{dx}{\sqrt{1 - x^4}}. \tag{34.41} \]

It is easy to check that

\[ \varphi(\alpha i) = i\,\varphi\alpha \tag{34.42} \]

and

\[ \frac{\omega}{2} = \frac{\varpi}{2} = \int_0^1 \frac{dx}{\sqrt{1 - x^4}}. \tag{34.43} \]

Abel applied the addition formula to show that for m + μ odd and x = φδ,

\[ \varphi(m + \mu i)\delta = x\,\psi(x^2), \tag{34.44} \]

for some rational function ψ. Then by changing δ to iδ and using (34.42), he obtained
φ(m + μi)δ = xψ(−x²), or ψ(−x²) = ψ(x²). He therefore concluded that²²

\[ \varphi(m + \mu i)\delta = x \cdot \frac{T}{S}, \tag{34.45} \]

where T and S were polynomials in powers of x⁴. This very significant result showed
that the elliptic function φδ permitted complex multiplication, that is, φ(m + μi)δ
could be expressed as a rational function of φδ. As an example, he noted that

\[ \varphi(2 + i)\delta = ix \cdot \frac{1 - 2i - x^4}{1 - (1 - 2i)x^4}, \tag{34.46} \]

a result proved by Gauss in an unpublished work, wherein he also divided the
lemniscate into 5 = (2 + i)(2 − i) parts.


22 ibid. p. 248.
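
Formula (34.46) can be checked against the addition formula (34.13) with c = e = 1. The Python sketch below is such a check and makes no claim to reproduce Abel’s derivation: it assumes the duplication values φ(2δ) = 2x√(1 − x⁴)/(1 + x⁴) and f(2δ)F(2δ) = (1 − 6x⁴ + x⁸)/(1 + x⁴)², which follow from (34.13), together with φ(iδ) = ix from (34.42). The function names are hypothetical.

```python
import cmath


def add_lemniscatic(pa, da, pb, db):
    """Addition formula (34.13) with c = e = 1; da and db stand for the products f*F."""
    return (pa * db + pb * da) / (1 + pa * pa * pb * pb)


def phi_2_plus_i_via_addition(x):
    """phi((2 + i) delta) built from phi(2 delta) and phi(i delta), with x = phi(delta)."""
    w = cmath.sqrt(1 - x ** 4)                           # f(delta) F(delta)
    p2 = 2 * x * w / (1 + x ** 4)                        # phi(2 delta)
    d2 = (1 - 6 * x ** 4 + x ** 8) / (1 + x ** 4) ** 2   # f(2 delta) F(2 delta)
    return add_lemniscatic(p2, d2, 1j * x, w)            # phi(i delta) = i x


def phi_2_plus_i_closed_form(x):
    """Right-hand side of (34.46)."""
    return 1j * x * (1 - 2j - x ** 4) / (1 - (1 - 2j) * x ** 4)
```

For real x between 0 and 1 the two functions should agree to rounding error.
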

Abel showed how (34.45) could be applied to the problem of dividing the
lemniscate into 4ν + 1 parts. By Fermat’s theorem on sums of two squares, Abel
could write

\[ \alpha^2 + \beta^2 = 4\nu + 1 = (\alpha + i\beta)(\alpha - i\beta), \]

where α + β was odd. With m = α, μ = β, and δ = ω/(α + βi), he could use (34.45) to
obtain x = φ(δ) as a root of T = 0. By using the periodicity of φ and the addition
formula, Abel proved that

\[ \pm\varphi\left(\frac{\omega}{\alpha + i\beta}\right), \pm\varphi\left(\frac{2\omega}{\alpha + i\beta}\right), \ldots, \pm\varphi\left(\frac{2\nu\omega}{\alpha + \beta i}\right) \]

comprised all the roots of the polynomial T. By setting T(x) = R(x²), he obtained

\[ \varphi^2(\delta), \varphi^2(2\delta), \varphi^2(3\delta), \ldots, \varphi^2(2\nu\delta) \tag{34.47} \]

as all the roots of R = 0. Next, Abel once again applied Gauss’s method. He first
showed that for a primitive root ε modulo 4ν + 1 = α² + β², the set φ²(ε^m δ), m = 0,
1, …, 2ν − 1, was equal to the set given in (34.47). He then referred to the method of
Lagrange resolvents to conclude that²³

\[ \varphi^2(\epsilon^m\delta) = \frac{1}{2\nu}\left(A + \theta^{-m}v^{\frac{1}{2\nu}} + s_2\theta^{-2m}v^{\frac{2}{2\nu}} + \cdots + s_{2\nu-1}\theta^{-(2\nu-1)m}v^{\frac{2\nu-1}{2\nu}}\right), \tag{34.48} \]

where θ was an imaginary root of θ^{2ν} − 1 = 0, and v, s_k were determined by the
expressions

\[ v = \left[\varphi^2(\delta) + \theta\,\varphi^2(\epsilon\delta) + \theta^2\varphi^2(\epsilon^2\delta) + \cdots + \theta^{2\nu-1}\varphi^2(\epsilon^{2\nu-1}\delta)\right]^{2\nu}, \tag{34.49} \]

\[ s_k = \frac{\varphi^2(\delta) + \theta^{k}\varphi^2(\epsilon\delta) + \cdots + \theta^{(2\nu-1)k}\varphi^2(\epsilon^{2\nu-1}\delta)}{\left[\varphi^2(\delta) + \theta\,\varphi^2(\epsilon\delta) + \cdots + \theta^{2\nu-1}\varphi^2(\epsilon^{2\nu-1}\delta)\right]^{k}}, \tag{34.50} \]

\[ A = \varphi^2(\delta) + \varphi^2(\epsilon\delta) + \cdots + \varphi^2(\epsilon^{2\nu-1}\delta). \tag{34.51} \]

Moreover, the expressions (34.49), (34.50), and (34.51) could be written as rational
functions of the coefficients of R = 0. Recall that the coefficients of R = 0 took the
form a + bi with a, b rational. Thus, v, s_k and A were of the form c + id, with c and
d rational.

Abel then noted that if 4ν + 1 = 1 + 2^n, then 2ν = 2^{n−1} and the values in (34.48)
could be computed by repeatedly taking square roots.²⁴ Thus, the values of φ(mω/(α + iβ))
could be evaluated by taking square roots; hence, by applying the addition formula, the
value of φ(mω/(4ν + 1)) could be so determined. This proved the result that the lemniscate
could be geometrically divided into 2^n + 1 parts, when this was a prime number. Recall
that in his Disquisitiones, Gauss had stated that this was true.


23 ibid. p. 253.
24 ibid. p. 255.
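
The two arithmetic ingredients of this argument, primes of the form 2^n + 1 and their decomposition as a sum of two squares, can be explored with a few lines of Python. This is a brute-force sketch with hypothetical function names; the exponent range 1 to 16 is an arbitrary choice.

```python
import math


def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def two_squares(p):
    """Return (a, b) with a <= b and a*a + b*b == p, or None if no such pair exists."""
    for a in range(1, math.isqrt(p) + 1):
        b = math.isqrt(p - a * a)
        if b >= a and a * a + b * b == p:
            return a, b
    return None


# Exponents n <= 16 for which 2**n + 1 is prime, paired with that prime.
fermat_type = [(n, 2 ** n + 1) for n in range(1, 17) if is_prime(2 ** n + 1)]
```

For a prime of the form 4ν + 1 the pair returned by two_squares supplies the α and β of the factorization (α + iβ)(α − iβ), and one of the two numbers is necessarily even, so that α + β is odd.
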


34.6 Jacobi’s Elliptic Functions 


In his Fundamenta Nova, Jacobi presented a detailed account of his theory of elliptic
functions. He inverted the elliptic integral²⁵

\[ u = \int_0^{\varphi} \frac{d\varphi}{\sqrt{1 - k^2\sin^2\varphi}} = \int_0^x \frac{dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}} \]

by defining the function x = sin am u, where φ = am u, calling φ the amplitude of u.
He noted that, in general, any trigonometric function of φ, such as cos φ = cos am u,
tan φ = tan am u, could be defined in this manner. Jacobi worked mainly with the
functions sin φ, cos φ, and

\[ \Delta\,\mathrm{am}\,u = \sqrt{1 - k^2\sin^2\mathrm{am}\,u} = \frac{d\,\mathrm{am}\,u}{du}. \]

Following Gudermann, we employ modern notation for these functions: sn u, cn u, and
dn u. When we emphasize dependence on modulus k, we write sn(u,k), cn(u,k), and
dn(u,k). The complementary modulus k′, defined by k² + k′² = 1, is also important.
Legendre denoted by K the complete elliptic integral obtained by taking x = 1 in the
preceding integral; he denoted the corresponding complete integral for the modulus k′
by K′.

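Jacobi listed the addition theorems and related identities, results he obtained
directly from those of Euler and Legendre:

\[ \mathrm{sn}(u + v) = \frac{\mathrm{sn}\,u\,\mathrm{cn}\,v\,\mathrm{dn}\,v + \mathrm{sn}\,v\,\mathrm{cn}\,u\,\mathrm{dn}\,u}{D}, \]
\[ \mathrm{cn}(u + v) = \frac{\mathrm{cn}\,u\,\mathrm{cn}\,v - \mathrm{sn}\,u\,\mathrm{dn}\,u\,\mathrm{sn}\,v\,\mathrm{dn}\,v}{D}, \]
\[ \mathrm{dn}(u + v) = \frac{\mathrm{dn}\,u\,\mathrm{dn}\,v - k^2\,\mathrm{sn}\,u\,\mathrm{cn}\,u\,\mathrm{sn}\,v\,\mathrm{cn}\,v}{D}, \]
\[ \mathrm{sn}(u + v)\,\mathrm{sn}(u - v) = \frac{\mathrm{sn}^2 u - \mathrm{sn}^2 v}{D}, \]

where

\[ D = 1 - k^2\,\mathrm{sn}^2 u\,\mathrm{sn}^2 v. \]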
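
These addition theorems are easy to test numerically for real arguments. The Python sketch below computes sn, cn, and dn by the classical arithmetic-geometric mean (descending Landen) scheme, which is a standard numerical method and not Jacobi’s own procedure; the function names ellipj and sn_addition_residual are hypothetical, and the routine assumes real u and 0 ≤ k < 1.

```python
import math


def ellipj(u, k):
    """Return (sn, cn, dn) for real u and modulus 0 <= k < 1 via the AGM scale."""
    a = [1.0]
    b = math.sqrt(1.0 - k * k)
    c = [k]
    while abs(c[-1]) > 1e-15:
        c.append(0.5 * (a[-1] - b))
        a_next = 0.5 * (a[-1] + b)
        b = math.sqrt(a[-1] * b)
        a.append(a_next)
    # Start from phi_N = 2^N * a_N * u and descend to phi_0 = am(u).
    phi = (2 ** (len(a) - 1)) * a[-1] * u
    for n in range(len(a) - 1, 0, -1):
        phi = 0.5 * (phi + math.asin(c[n] / a[n] * math.sin(phi)))
    sn = math.sin(phi)
    return sn, math.cos(phi), math.sqrt(1.0 - k * k * sn * sn)


def sn_addition_residual(u, v, k):
    """Difference between sn(u + v) and the right-hand side of its addition theorem."""
    su, cu, du = ellipj(u, k)
    sv, cv, dv = ellipj(v, k)
    d = 1.0 - k * k * su * su * sv * sv
    return ellipj(u + v, k)[0] - (su * cv * dv + sv * cu * du) / d
```

For k = 0 the routine reduces to the ordinary sine and cosine, and for other moduli the residual should be at the level of rounding error.
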


25 Jacobi (1969) vol. 1, pp. 81-87. 
Jacobi then extended the domain of the elliptic functions by applying the transformation,
later called Jacobi’s imaginary transformation:

\[ \sin\varphi = i\tan\psi. \tag{34.52} \]

This implied cos φ = sec ψ and dφ = i dψ/cos ψ, and

\[ \frac{d\varphi}{\sqrt{1 - k^2\sin^2\varphi}} = \frac{i\,d\psi}{\sqrt{\cos^2\psi + k^2\sin^2\psi}} = \frac{i\,d\psi}{\sqrt{1 - k'^2\sin^2\psi}}. \]

Jacobi used this to write

\[ \sin\mathrm{am}(iu,k) = i\tan\mathrm{am}(u,k'), \]
\[ \cos\mathrm{am}(iu,k) = \sec\mathrm{am}(u,k'), \]
\[ \tan\mathrm{am}(iu,k) = i\sin\mathrm{am}(u,k'), \]

and other similar formulas. From these results, Jacobi deduced that sn(u,k) had
periods 4K and 2iK′; cn(u,k) had periods 4K and 2K + 2iK′; and dn(u,k) had
periods 2K and 4iK′. Moreover, in a period parallelogram, sn u had zeros at u = 0
and at u = 2K and had poles at iK′ and 2K + iK′. Jacobi had similar results for cn u
and dn u.

We note an application of Jacobi’s imaginary transformation to the quadratic
transformations discussed earlier. This will provide an introduction to the higher-order
transformations appearing in the next two sections. Recall that Landen’s
quadratic transformation

\[ y = \frac{(1 + k')\,x\sqrt{1 - x^2}}{\sqrt{1 - k^2x^2}} \tag{34.53} \]

produces the differential relation

\[ \frac{dy}{\sqrt{(1 - y^2)(1 - \lambda^2y^2)}} = \frac{(1 + k')\,dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}}, \tag{34.54} \]

where λ = (1 − k′)/(1 + k′), or in other words, k = 2√λ/(1 + λ). This algebraic relation
between the moduli λ and k is called a modular equation. By means of this relation, we
may write (34.54) as

\[ \frac{(1 + \lambda)\,dy}{\sqrt{(1 - y^2)(1 - \lambda^2y^2)}} = \frac{2\,dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}}. \tag{34.55} \]

If we integrate the differential on the right-hand side of (34.55) from 0 to 1, we
get 2K. However, as x increases from 0 to 1/√(1 + k′), y increases from 0 to 1; and as x
continues to increase to 1, y decreases from 1 to 0. Thus, if Λ denotes the complete
integral corresponding to the modulus λ, we get the equation

\[ 2(1 + \lambda)\Lambda = 2K \quad \text{or} \quad K = (1 + \lambda)\Lambda. \tag{34.56} \]
Now note that the second quadratic transformation of Gauss

\[ z = \frac{(1 + \lambda)\,y}{1 + \lambda y^2} \tag{34.57} \]

produces the differential relation

\[ \frac{dz}{\sqrt{(1 - z^2)(1 - \gamma^2z^2)}} = \frac{(1 + \lambda)\,dy}{\sqrt{(1 - y^2)(1 - \lambda^2y^2)}}, \tag{34.58} \]

where γ = 2√λ/(1 + λ). We can therefore take γ = k and apply (34.55) followed by (34.58)
to obtain duplication:

\[ \frac{dz}{\sqrt{(1 - z^2)(1 - k^2z^2)}} = \frac{2\,dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}}. \]

One of Jacobi’s earliest discoveries was that there were, similarly, two cubic
transformations, and when these were applied consecutively, they produced triplication. He
then extended this to general odd-order transformations.

Jacobi's imaginary transformation (34.52) when written in terms of x and y
amounts to setting
\[ x = \frac{iX}{\sqrt{1-X^2}}, \qquad y = \frac{iY}{\sqrt{1-Y^2}}. \tag{34.59} \]
When these expressions for x and y are substituted in Landen's transformation
(34.53), we obtain, after simplification, Gauss's form of the transformation:
\[ \frac{Y}{\sqrt{1-Y^2}} = \frac{(1+k')X}{\sqrt{(1-X^2)(1-k'^2X^2)}} \]
or
\[ Y = \frac{(1+k')X}{1+k'X^2}. \]
Moreover, the differential relation (34.55) converts to
\[ \frac{(1+\lambda)\,dY}{\sqrt{(1-Y^2)(1-\lambda'^2Y^2)}} = \frac{2\,dX}{\sqrt{(1-X^2)(1-k'^2X^2)}}. \]
Observe that, since X and Y increase simultaneously from 0 to 1, this relation can
be written in terms of complete integrals:
\[ 2K' = (1+\lambda)\Lambda'. \]
Dividing this equation by (34.56) gives another form of the modular relation, also
used by Legendre:
\[ \frac{2K'}{K} = \frac{\Lambda'}{\Lambda}. \tag{34.60} \]
As one might expect, when Jacobi's imaginary transformation is applied to
Gauss's transformation, one obtains Landen's transformation, except that k and λ are
converted to their complements k′ and λ′. These results also carry over to general
transformations.
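The two relations between complete integrals obtained above, K = (1 + λ)Λ and 2K′/K = Λ′/Λ with λ = (1 − k′)/(1 + k′), can be checked numerically. The following is only a sketch: it assumes the standard arithmetic-geometric mean evaluation K(k) = π / (2 AGM(1, k′)), which is not part of the text above, and the function names are illustrative.

```python
import math

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers (fixed iteration count)."""
    for _ in range(40):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def complete_K(k):
    """Complete elliptic integral of the first kind for modulus 0 <= k < 1 (AGM assumption)."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))

def landen_residuals(k):
    """Return (K - (1+lam)*Lam, 2K'/K - Lam'/Lam) for lam = (1-k')/(1+k')."""
    kp = math.sqrt(1.0 - k * k)
    lam = (1.0 - kp) / (1.0 + kp)
    lam_p = math.sqrt(1.0 - lam * lam)
    K, Kp = complete_K(k), complete_K(kp)
    Lam, Lam_p = complete_K(lam), complete_K(lam_p)
    return K - (1.0 + lam) * Lam, 2.0 * Kp / K - Lam_p / Lam

if __name__ == "__main__":
    print(landen_residuals(0.6))
```

Both residuals should vanish up to rounding error for any modulus strictly between 0 and 1.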


34.7 Jacobi: Cubic and Quintic Transformations 


In a letter of June 13, 1827,²⁶ Jacobi communicated to Schumacher, editor of the
Astronomische Nachrichten, two cubic and two quintic transformations. Jacobi's first
result stated: If we set
\[ \sin\phi = \frac{\sin\psi\left(ac + \left(\frac{a-c}{2}\right)^2\sin^2\psi\right)}{cc + \frac{a-c}{2}\cdot\frac{a+3c}{2}\sin^2\psi}, \tag{34.61} \]
we obtain
\[ \frac{d\phi}{\sqrt{a^3c - \frac{a-c}{2}\left(\frac{a+3c}{2}\right)^3\sin^2\phi}} = \frac{d\psi}{\sqrt{ac^3 - \left(\frac{a-c}{2}\right)^3\frac{a+3c}{2}\sin^2\psi}}. \tag{34.62} \]
If, in addition,
\[ \sin\psi = \frac{\sin\theta\left(-3ac + \left(\frac{a+3c}{2}\right)^2\sin^2\theta\right)}{aa - 3\,\frac{a-c}{2}\cdot\frac{a+3c}{2}\sin^2\theta} \tag{34.63} \]
and
\[ \kappa = \frac{a-c}{2c}\left(\frac{a+3c}{2a}\right)^3, \tag{34.64} \]
then we have
\[ \frac{d\phi}{\sqrt{1-\kappa\sin^2\phi}} = \frac{3\,d\theta}{\sqrt{1-\kappa\sin^2\theta}}. \tag{34.65} \]

26 Jacobi (1969) vol. 1, pp. 31-33.

Note that (34.61) and (34.63) are Jacobi's two cubic transformations, and when
applied in succession, they produce the triplication (34.65) for the modulus k² = κ,
given by (34.64).

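Jacobi's second result stated: If we set $a^3 = 2b(1+a+b)$ and
\[ \sin\phi = \frac{\sin\psi\,(1 + 2a + (aa + 2ab + 2b)\sin^2\psi + bb\sin^4\psi)}{1 + (aa + 2a + 2b)\sin^2\psi + b(b+2a)\sin^4\psi}, \tag{34.66} \]
we get
\[ \frac{d\phi}{\sqrt{(a-2b)(1+2a)^2 - (2-a)(b+2a)^2\sin^2\phi}} = \frac{d\psi}{\sqrt{a - 2b - bb(2-a)\sin^2\psi}}. \]
Also, if
\[ \alpha = \frac{2-a}{1+2a}, \qquad \beta = -\frac{b+2a}{1+2a}\cdot\frac{2-a}{a-2b}, \qquad \kappa = \frac{2-a}{a-2b}\left(\frac{b+2a}{1+2a}\right)^2, \]
\[ \sin\psi = \frac{\sin\theta\,(1 + 2\alpha + (\alpha\alpha + 2\alpha\beta + 2\beta)\sin^2\theta + \beta\beta\sin^4\theta)}{1 + (\alpha\alpha + 2\alpha + 2\beta)\sin^2\theta + \beta(\beta+2\alpha)\sin^4\theta}, \tag{34.67} \]
then we have
\[ \frac{d\phi}{\sqrt{1-\kappa\sin^2\phi}} = \frac{5\,d\theta}{\sqrt{1-\kappa\sin^2\theta}}. \tag{34.68} \]
Here (34.66) and (34.67) are Jacobi's quintic transformations, and they together
produce the quinsection given by (34.68).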

In his letter to Legendre of April 12, 1828,²⁷ Jacobi wrote that he found (34.63) and
(34.67) by trial and error. But he explained that he had found the cubic and quintic
transformations (34.61) and (34.66) on the basis of the general algebraic theory of
transformations he had developed in March 1827. For this theory, he considered the
transformation $y = U/V$ where U and V were polynomials in x differing in degree by
at most one, and such that
\[ \frac{dy}{\sqrt{Y}} = \frac{1}{M}\,\frac{dx}{\sqrt{X}}, \tag{34.69} \]
where X and Y were quartics in x and y, respectively, and M was a constant depending
on the constants in X and Y. In particular, he took $X = (1-x^2)(1-k^2x^2)$ and
$Y = (1-y^2)(1-\lambda^2y^2)$. By substituting $y = U/V$ in (34.69), he obtained the relation
\[ \frac{dy}{\sqrt{(1-y^2)(1-\lambda^2y^2)}} = \frac{\left(V\frac{dU}{dx} - U\frac{dV}{dx}\right)dx}{\sqrt{(V^2-U^2)(V^2-\lambda^2U^2)}}. \]

27 ibid. pp. 409-416.


He noted that if U and V were of degree p, the numerator of the expression on the
right was a polynomial of degree 2p − 2, while the expression inside the radical was
of degree 4p. Moreover, since for any number α,
\[ (V - \alpha U)\frac{dU}{dx} - U\frac{d(V-\alpha U)}{dx} = V\frac{dU}{dx} - U\frac{dV}{dx}, \]
it followed that if any of the factors V ± U, V ± λU in the denominator had a square
factor (1 − βx)², then 1 − βx was a factor of the numerator polynomial. Thus, if the
denominator was of form T²X, where X was a quartic and T was of degree 2p − 2,
then
\[ \frac{1}{T}\left(V\frac{dU}{dx} - U\frac{dV}{dx}\right) \]
was a constant depending only on the constants in X and Y. Jacobi noted that the
problem of finding $y = U/V$ was determinate because $U/V$ had 2p + 1 constants of which
2p − 2 could be determined by requiring that $(V^2-U^2)(V^2-\lambda^2U^2) = T^2X$. This left
three undetermined constants, and that number could not be reduced because x could
be replaced by $\frac{a+bx}{1+dx}$, resulting in a similar relation. Thus, he looked for polynomials
U and V such that V + U = (1 + x)AA, V − U = (1 − x)BB, V + λU = (1 + kx)CC,
V − λU = (1 − kx)DD. He also noted that y was an odd function of x,
and hence U = xF(x²) and V = φ(x²). Moreover, the equation (34.69) remained
invariant when y was replaced by $\frac{1}{\lambda y}$ and x by $\frac{1}{kx}$. This observation allowed him to
determine explicit algebraic relations between k, λ, and the coefficients of U and V.
In particular, it was possible to obtain for small values of p (the degree of U) the
explicit algebraic relations satisfied by k and λ. These relations are called modular
equations. So if either k or λ is given, the other can be found as one of the roots of this
equation. The value of M can also be determined. It can be proven that if p is an odd
prime, then the modular equation is irreducible and of order p + 1. Thus, for a given k,
there are p + 1 different values of λ and each one leads to a distinct transformation
of order p. We note that Legendre and Jacobi took k² to be between 0 and 1. The
modular equation gave p + 1 values of λ of which two were real, one greater than k
and the other less than k. Jacobi denoted the smaller value by λ and the larger by λ₁;
he called the transformation with the smaller λ the first transformation and the other
the second transformation. He noted that when the two transformations were applied
one after the other, the result was a multiplication by p of the differential. So if $y = U/V$
was the first transformation and $z = U_1/V_1$ the second, then
\[ \frac{dz}{\sqrt{(1-z^2)(1-k^2z^2)}} = \frac{p\,dx}{\sqrt{(1-x^2)(1-k^2x^2)}}. \]

Jacobi worked out the algebraic theory of transformations only for the cubic
and quintic cases; he needed the theory of elliptic functions to develop the higher- 
order transformations. He called this the transcendental theory of transformations. 
He may have obtained from Abel the idea of the elliptic function as the inverse of 
an elliptic integral, though he unfortunately never discussed this question. Jacobi did 
not go deeply into modular equations; the algebraic theory of modular equations was 
developed, starting in the 1850s, by Betti, Brioschi, Hermite, and Kronecker. 

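In the Fundamenta Nova, Jacobi gave details of how he found the first cubic
transformation. First set $a/c = 2\alpha + 1$ in (34.61). Then the transformation would take
the form: If
\[ y = \frac{x(2\alpha + 1 + \alpha^2x^2)}{1 + \alpha(\alpha+2)x^2}, \tag{34.70} \]
then
\[ \frac{dy}{\sqrt{(1-y^2)(1-\lambda^2y^2)}} = \frac{(2\alpha+1)\,dx}{\sqrt{(1-x^2)(1-k^2x^2)}}, \tag{34.71} \]
where $k^2 = \frac{\alpha^3(2+\alpha)}{2\alpha+1}$ and $\lambda^2 = \alpha\left(\frac{2+\alpha}{2\alpha+1}\right)^3$.

Recall that since U = xF(x²) and V = φ(x²), to derive this cubic transformation,
Jacobi could take²⁸
\[ V = 1 + bx^2 \quad\text{and}\quad U = x(a + a_1x^2). \]
He then assumed that A was of the form 1 + αx so that
\[ V + U = (1+x)AA = 1 + (1+2\alpha)x + \alpha(2+\alpha)x^2 + \alpha\alpha x^3. \]
By equating the powers of x, he had
\[ b = \alpha(2+\alpha), \quad a = 1 + 2\alpha, \quad\text{and}\quad a_1 = \alpha^2. \]
Note that this gives the preceding cubic transformation (34.70). To find the
algebraic relation satisfied by k and λ, he changed x into $\frac{1}{kx}$ and y into $\frac{1}{\lambda y}$ in (34.70)
to get
\[ \frac{\lambda x\left((2\alpha+1)\alpha^2 + \alpha^4x^2\right)}{\alpha^2 + \alpha^3(\alpha+2)x^2} = \frac{kx\left(\alpha(\alpha+2) + k^2x^2\right)}{\alpha^2 + (2\alpha+1)k^2x^2}. \]
By equating coefficients of various powers of x, Jacobi found
\[ k^2 = \frac{\alpha^3(2+\alpha)}{2\alpha+1}, \qquad \lambda^2 = \frac{k^6}{\alpha^8} = \alpha\left(\frac{2+\alpha}{2\alpha+1}\right)^3. \]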

28 ibid. pp. 74-75. 




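The complementary moduli were then given by
\[ k'^2 = \frac{(1-\alpha)(1+\alpha)^3}{2\alpha+1}, \qquad \lambda'^2 = \frac{(1+\alpha)(1-\alpha)^3}{(2\alpha+1)^3}. \]
Observe that this immediately gives the modular equation $\sqrt{k\lambda} + \sqrt{k'\lambda'} = 1$.
Moreover, he noted that with $D = 1 + \alpha(\alpha+2)x^2$,
\[ 1 - y = \frac{(1-x)(1-\alpha x)^2}{D}, \qquad 1 + y = \frac{(1+x)(1+\alpha x)^2}{D}, \]
\[ 1 - \lambda y = \frac{(1-kx)\left(1 - \frac{kx}{\alpha}\right)^2}{D}, \qquad 1 + \lambda y = \frac{(1+kx)\left(1 + \frac{kx}{\alpha}\right)^2}{D}, \]
and hence he arrived at the transformation (34.71):
\[ \frac{dy}{\sqrt{(1-y^2)(1-\lambda^2y^2)}} = \frac{(2\alpha+1)\,dx}{\sqrt{(1-x^2)(1-k^2x^2)}}. \]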

Jacobi wrote the modular equation in a slightly different form, by setting $k^{1/4} = u$
and $\lambda^{1/4} = v$, to get
\[ u^4 - v^4 + 2uv(1 - u^2v^2) = 0. \tag{34.72} \]
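The modular equation in terms of u and v, together with the form √(kλ) + √(k′λ′) = 1, can be tested directly from the parametrization by α. The sketch below assumes k² = α³(2 + α)/(2α + 1), λ² = α((2 + α)/(2α + 1))³ and takes positive real fourth roots, which requires 0 < α < 1; the helper names are illustrative only.

```python
import math

def cubic_moduli(alpha):
    """Return (k, lambda) for Jacobi's cubic parametrization, assuming 0 < alpha < 1."""
    k = math.sqrt(alpha**3 * (2 + alpha) / (2 * alpha + 1))
    lam = math.sqrt(alpha * ((2 + alpha) / (2 * alpha + 1)) ** 3)
    return k, lam

def modular_residuals(alpha):
    """Return the residuals of u^4 - v^4 + 2uv(1 - u^2 v^2) = 0 and sqrt(k lam) + sqrt(k' lam') = 1."""
    k, lam = cubic_moduli(alpha)
    u, v = k ** 0.25, lam ** 0.25
    kp, lam_p = math.sqrt(1 - k * k), math.sqrt(1 - lam * lam)
    jacobi_form = u**4 - v**4 + 2 * u * v * (1 - u**2 * v**2)
    legendre_form = math.sqrt(k * lam) + math.sqrt(kp * lam_p) - 1
    return jacobi_form, legendre_form

if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, modular_residuals(alpha))
```

For α = 0.5 one finds k ≈ 0.3953 and λ ≈ 0.9882, and both residuals vanish to rounding error; the same parameter also satisfies α = u³/v.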


He showed how to obtain the second transformation from this modular equation.
He first wrote (34.70) in terms of u and v by observing that $\alpha^4 = k^3/\lambda$, or $\alpha = u^3/v$. Then
(34.70) and (34.71) could be rewritten as
\[ y = \frac{v(v + 2u^3)x + u^6x^3}{v^2 + v^3u^2(v + 2u^3)x^2}, \tag{34.73} \]
\[ \frac{dy}{\sqrt{(1-y^2)(1-v^8y^2)}} = \frac{v + 2u^3}{v}\cdot\frac{dx}{\sqrt{(1-x^2)(1-u^8x^2)}}. \tag{34.74} \]
Jacobi then observed that the modular equation remained unchanged when u and v
were changed to −v and u, respectively. This gave him the second transformation
\[ z = \frac{u(u - 2v^3)y + v^6y^3}{u^2 + u^3v^2(u - 2v^3)y^2}, \tag{34.75} \]
\[ \frac{dz}{\sqrt{(1-z^2)(1-u^8z^2)}} = \frac{u - 2v^3}{u}\cdot\frac{dy}{\sqrt{(1-y^2)(1-v^8y^2)}}. \tag{34.76} \]
By the modular equation
\[ \left(\frac{v + 2u^3}{v}\right)\left(\frac{u - 2v^3}{u}\right) = \frac{2(u^4 - v^4) + uv(1 - 4u^2v^2)}{uv} = -3, \]
he obtained the triplication formula
\[ \frac{dz}{\sqrt{(1-z^2)(1-u^8z^2)}} = \frac{-3\,dx}{\sqrt{(1-x^2)(1-u^8x^2)}}. \]
To get +3 instead of −3, it was sufficient to change z to −z.
In the case of the quintic transformation, Jacobi set²⁹
\[ V = 1 + b_1x^2 + b_2x^4, \quad U = x(a_1 + a_2x^2 + a_3x^4), \quad A = 1 + \alpha x + \beta x^2. \]
From the equation V + U = (1 + x)AA, he found
\[ b_1 = 2\alpha + 2\beta + \alpha\alpha, \quad b_2 = \beta(2\alpha + \beta), \]
\[ a_1 = 1 + 2\alpha, \quad a_2 = 2\beta + \alpha\alpha + 2\alpha\beta, \quad a_3 = \beta\beta. \]
He gave the details in section 15 of his Fundamenta. He presented the modular
equation in the form
\[ u^6 - v^6 + 5u^2v^2(u^2 - v^2) + 4uv(1 - u^4v^4) = 0. \]
In 1858, Hermite used this relation to solve a quintic equation, just as Viète solved
a cubic by means of trigonometric functions.


34.8 Jacobi’s Transcendental Theory of Transformations 


Euler, Legendre, and others were aware of the fact that the addition formula for 
elliptic integrals solved the problem of the multiplication or division of an elliptic 
integral by an integer. In transformation theory, the multiplication was accomplished 
in two steps. The first step was to apply a transformation that gave a new elliptic 
integral with a modulus λ² smaller than the original modulus k². This was followed
by a second transformation serving to increase the modulus. Jacobi discovered these
facts about transformation theory by the summer of 1827, at least in the cases of
the cubic and quintic transformations. To develop the theory in general, he had to
invert the elliptic integral and work with elliptic functions. In his December 1827
paper, however, he gave only the first transformation because he did not define elliptic
functions of a complex variable.³⁰ It was after he introduced complex periods in the
spring of 1828 that he was able to develop the complete transformation theory as 
presented in his Fundamenta Nova. He explained how the two transformations arose 
and also the manner in which they were related to the complementary transformations. 
To obtain a glimpse of the general theory, we consider the cubic transformation in
some detail from the transcendental viewpoint. For the most part, we follow the
exposition from Cayley's Elliptic Functions,³¹ in which he also presented Jacobi's
work in streamlined form.

29 ibid. pp. 77-79.
30 ibid. pp. 39-48.

It can be shown by means of the addition formula for the elliptic function
x = sn(u,k) that if z = sn(3u,k), then
\[ z = \frac{3x\left(1 - \frac{x^2}{a_1^2}\right)\left(1 - \frac{x^2}{a_2^2}\right)\left(1 - \frac{x^2}{a_3^2}\right)\left(1 - \frac{x^2}{a_4^2}\right)}{(1 - k^2a_1^2x^2)(1 - k^2a_2^2x^2)(1 - k^2a_3^2x^2)(1 - k^2a_4^2x^2)}, \tag{34.77} \]
where
\[ a_1 = \operatorname{sn}\frac{4K}{3}, \quad a_2 = \operatorname{sn}\frac{4iK'}{3}, \quad a_3 = \operatorname{sn}\frac{4K + 4iK'}{3}, \quad a_4 = \operatorname{sn}\frac{-4K + 4iK'}{3}. \]
Also, it follows from a formula of Legendre that a₁, a₂, a₃, a₄ are the roots of
\[ 3 - 4(1+k^2)x^2 + 6k^2x^4 - k^4x^8 = 0. \]
Note that Legendre knew that (34.77) was an integral of the differential equation
\[ \frac{dz}{\sqrt{(1-z^2)(1-k^2z^2)}} = \frac{3\,dx}{\sqrt{(1-x^2)(1-k^2x^2)}}. \tag{34.78} \]
Now from Jacobi's algebraic theory presented in Section 34.7, it follows that
the first transformation has the form
\[ y = \frac{x}{M}\cdot\frac{1 - \frac{x^2}{a_1^2}}{1 - k^2a_1^2x^2}, \tag{34.79} \]
where M is to be determined. Recall that Jacobi required the existence of a polynomial
A such that V − U = (1 − x)A² where $y = U/V$. This means that the value x = 1 can
be required to correspond to y = 1. Taking these values for x and y in (34.79), we
see that
\[ M = -\frac{1 - a_1^2}{a_1^2(1 - k^2a_1^2)}, \]
\[ 1 - y = \frac{1 - k^2a_1^2x^2 - \frac{x}{M}\left(1 - \frac{x^2}{a_1^2}\right)}{D}, \]
where $D = 1 - k^2a_1^2x^2$. We can rewrite the numerator of 1 − y as
\[ (1 - x)\left(1 + \left(1 - \frac{1}{M}\right)x - \frac{x^2}{Ma_1^2}\right). \]

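31 Cayley (1895) pp. 206-210.

Now let $A = 1 - \frac{x}{f}$ so that, for consistency, we require that
\[ 1 + \left(1 - \frac{1}{M}\right)x - \frac{x^2}{Ma_1^2} = 1 - \frac{2}{f}x + \frac{x^2}{f^2}. \]
Equating coefficients, we get
\[ -\frac{2}{f} = 1 - \frac{1}{M} = \frac{1 - k^2a_1^4}{1 - a_1^2}, \qquad \frac{1}{f^2} = -\frac{1}{Ma_1^2} = \frac{1 - k^2a_1^2}{1 - a_1^2}. \tag{34.80} \]
These relations are consistent because, by the addition formula and periodicity of sn u,
\[ \operatorname{sn}\frac{8K}{3} = -\operatorname{sn}\frac{4K}{3} = \frac{2\operatorname{sn}\left(\frac{4K}{3}\right)\operatorname{cn}\left(\frac{4K}{3}\right)\operatorname{dn}\left(\frac{4K}{3}\right)}{1 - k^2\operatorname{sn}^4\left(\frac{4K}{3}\right)}, \]
or $2\sqrt{1 - a_1^2}\sqrt{1 - k^2a_1^2} = -(1 - k^2a_1^4)$.
Hence,
\[ 1 - y = \frac{(1-x)\left(1 - \frac{x}{f}\right)^2}{D}, \qquad 1 + y = \frac{(1+x)\left(1 + \frac{x}{f}\right)^2}{D}. \]
The next step is to determine λ by using the invariance of the transformation (34.79)
under the change x to $\frac{1}{kx}$ and y to $\frac{1}{\lambda y}$. This gives
\[ \lambda = \frac{k^3(1 - a_1^2)^2}{(1 - k^2a_1^2)^2} = M^2k^3a_1^4. \]
Note that since a₁ is real, we have $1 - a_1^2 < 1 - k^2a_1^2$ and λ is smaller than k. It is also
easy to check that
\[ 1 - \lambda y = \frac{(1 - kx)(1 - kfx)^2}{D}, \qquad 1 + \lambda y = \frac{(1 + kx)(1 + kfx)^2}{D}. \tag{34.81} \]
It follows that
\[ \frac{M\,dy}{\sqrt{(1-y^2)(1-\lambda^2y^2)}} = \frac{dx}{\sqrt{(1-x^2)(1-k^2x^2)}}. \tag{34.82} \]
Then, by means of an algebraic calculation, we obtain $\sqrt{\lambda k} + \sqrt{\lambda'k'} = 1$.
Now for the second transformation, we require that if it is applied after the first, we
get triplication. Note that (34.79) implies (34.82). Therefore, we want a transformation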

\[ z = 3My\cdot\frac{1 - \frac{y^2}{\theta^2}}{1 - \lambda^2\theta^2y^2} \tag{34.83} \]
such that
\[ \frac{dz}{\sqrt{(1-z^2)(1-k^2z^2)}} = \frac{3M\,dy}{\sqrt{(1-y^2)(1-\lambda^2y^2)}}. \tag{34.84} \]
Thus, (34.78) must hold in this case. Next note that if the value of y given by (34.79)
when substituted in (34.83) were to produce (34.77), then (34.78) would hold true.
Moreover, it can be shown that if we take $\theta = -\frac{a_2a_3a_4}{Ma_1^2}$, then
\[ 1 - \frac{y}{\theta} = \frac{\left(1 - \frac{x}{a_2}\right)\left(1 - \frac{x}{a_3}\right)\left(1 - \frac{x}{a_4}\right)}{D}, \]
\[ 1 - \lambda\theta y = \frac{(1 - ka_2x)(1 - ka_3x)(1 - ka_4x)}{D}, \]
and there are similar formulas for $1 + \frac{y}{\theta}$, $1 + \lambda\theta y$ where the sign of x is changed. So
this value of θ in (34.83) indeed produces the desired result (34.84). Moreover θ is
related to λ as a₁ to k, that is, θ is a solution of
\[ 3 - 4(1 + \lambda^2)\theta^2 + 6\lambda^2\theta^4 - \lambda^4\theta^8 = 0. \]


In fact, it can be shown that θ may be taken to be the purely imaginary value


4iK’ 
aa=sn 5 
é 3 


This implies that θ² is real and negative and that (34.83) is a real transformation.
Transformations similar to (34.79), wherein a₁ is replaced by a₃ or a₄, contain
complex numbers.

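In general, for an odd integer n, Cayley gave Jacobi's transformation formulas in
the form³²
\[ y = \frac{x}{M}\cdot\frac{\prod_{s=1}^{(n-1)/2}\left(1 - \frac{x^2}{\operatorname{sn}^2 2s\omega}\right)}{\prod_{s=1}^{(n-1)/2}\left(1 - k^2(\operatorname{sn}^2 2s\omega)x^2\right)}, \]
where m₁ and m₂ were integers and
\[ \omega = \frac{m_1K + m_2iK'}{n}. \]
Denoting the denominator on the right-hand side by D, he showed that under the
conditions
\[ \lambda = k^n\prod_{s=1}^{(n-1)/2}\operatorname{sn}^4(K - 2s\omega), \tag{34.85} \]
\[ \lambda' = \frac{k'^n}{\prod_{s=1}^{(n-1)/2}\operatorname{dn}^4(2s\omega)}, \]
\[ M = (-1)^{\frac{n-1}{2}}\prod_{s=1}^{(n-1)/2}\frac{\operatorname{sn}^2(K - 2s\omega)}{\operatorname{sn}^2(2s\omega)}, \tag{34.86} \]
the expressions for 1 − y, 1 + y, 1 − λy, 1 + λy were consistent with each other and
with the expression for y:
\[ (1 - y)D = (1 - x)\prod_{s=1}^{(n-1)/2}\left(1 - \frac{x}{\operatorname{sn}(K - 2s\omega)}\right)^2, \]
\[ (1 + y)D = (1 + x)\prod_{s=1}^{(n-1)/2}\left(1 + \frac{x}{\operatorname{sn}(K - 2s\omega)}\right)^2, \]
\[ (1 - \lambda y)D = (1 - kx)\prod_{s=1}^{(n-1)/2}\left(1 - kx\operatorname{sn}(K - 2s\omega)\right)^2, \]
\[ (1 + \lambda y)D = (1 + kx)\prod_{s=1}^{(n-1)/2}\left(1 + kx\operatorname{sn}(K - 2s\omega)\right)^2. \]

32 ibid. pp. 251-255.

These equations implied the differential equation (34.84). He also rewrote the
transformation formulas in the form
\[ \operatorname{sn}\left(\frac{u}{M},\lambda\right) = \frac{\operatorname{sn}u}{M}\prod_{s=1}^{(n-1)/2}\left(1 - \frac{\operatorname{sn}^2u}{\operatorname{sn}^2 2s\omega}\right) \div D, \tag{34.87} \]
\[ \operatorname{cn}\left(\frac{u}{M},\lambda\right) = \operatorname{cn}u\prod_{s=1}^{(n-1)/2}\left(1 - \frac{\operatorname{sn}^2u}{\operatorname{sn}^2(K - 2s\omega)}\right) \div D, \]
\[ \operatorname{dn}\left(\frac{u}{M},\lambda\right) = \operatorname{dn}u\prod_{s=1}^{(n-1)/2}\left(1 - k^2\operatorname{sn}^2(K - 2s\omega)\operatorname{sn}^2u\right) \div D, \]
where
\[ D = \prod_{s=1}^{(n-1)/2}\left(1 - k^2\operatorname{sn}^2(2s\omega)\operatorname{sn}^2u\right), \quad\text{and}\quad \operatorname{sn}u = \operatorname{sn}(u,k). \]
The real transformations corresponded to the cases $\omega = \frac{K}{n}$ and $\omega' = \frac{iK'}{n}$. Then,
by applying the imaginary transformation, Jacobi obtained the transformations
for the moduli k′ and λ′. This meant that the transformation for $\frac{K}{n}$, that is the
first transformation, was converted to the form of the second transformation, arising
from $\frac{iK'}{n}$.

Cayley presented Jacobi's relation between the complete integrals K and Λ by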


observing³³ that for $\omega = \frac{K}{n}$ the least positive value for which $\operatorname{sn}\left(\frac{u}{M},\lambda\right)$ vanished in
(34.87) was given by $\frac{u}{M} = 2\Lambda$, while on the right-hand side it was given by $u = \frac{2K}{n}$.
Hence,
\[ 2M\Lambda = \frac{2K}{n} \quad\text{or}\quad \frac{K}{nM} = \Lambda. \tag{34.88} \]
Note that this relation came from the first transformation, since ω was taken to
be $\frac{K}{n}$. Jacobi denoted the value of M by M₁ in the second transformation, where ω
was taken to be $\frac{iK'}{n}$. Since sn²(2sω) was negative in this case, Jacobi noted that the
smallest value of u for which the right-hand side of (34.87) vanished was given by
u = 2K. Hence, he obtained
\[ 2M_1\Lambda_1 = 2K \quad\text{or}\quad \frac{K}{M_1} = \Lambda_1. \tag{34.89} \]
On the other hand, the transformations for the complementary moduli gave Jacobi
the relations
\[ \Lambda' = \frac{K'}{M} \quad\text{and}\quad \Lambda_1' = \frac{K'}{nM_1}. \tag{34.90} \]
The first relation combined with (34.88) produced the modular equation
\[ \frac{\Lambda'}{\Lambda} = n\frac{K'}{K}, \]
while the second together with (34.89) gave
\[ \frac{K'}{K} = n\frac{\Lambda_1'}{\Lambda_1}. \]
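The relation K′/K = nΛ₁′/Λ₁ for the second transformation can be illustrated for n = 3. The sketch below is an assumption-laden check, not Jacobi's argument: it uses the cubic parametrization k² = α³(2 + α)/(2α + 1), λ² = α((2 + α)/(2α + 1))³ from Section 34.7 with 0 < α < 1, for which the transformed modulus is the larger of the two real values and so plays the role of λ₁, and it evaluates complete integrals by the arithmetic-geometric mean.

```python
import math

def agm(a, b):
    """Arithmetic-geometric mean (fixed iteration count; converges quadratically)."""
    for _ in range(40):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def complete_K(k):
    """Complete elliptic integral of the first kind via K = pi / (2 AGM(1, k'))."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))

def period_ratio(k):
    """Return K'/K for modulus k."""
    return complete_K(math.sqrt(1.0 - k * k)) / complete_K(k)

def cubic_ratio_residual(alpha):
    """K'/K - 3 Lambda_1'/Lambda_1 for the moduli attached to alpha, 0 < alpha < 1."""
    k = math.sqrt(alpha**3 * (2 + alpha) / (2 * alpha + 1))
    lam1 = math.sqrt(alpha * ((2 + alpha) / (2 * alpha + 1)) ** 3)
    return period_ratio(k) - 3.0 * period_ratio(lam1)

if __name__ == "__main__":
    print(cubic_ratio_residual(0.5))
```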

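Jacobi also found transformations easily derivable from the first and second
transformations; he named these supplementary transformations and used them to
obtain product expansions for elliptic functions. For example, he started with the
second transformation
\[ \operatorname{sn}\left(\frac{u}{M_1},\lambda_1\right) = \frac{\operatorname{sn}u}{M_1}\prod_{s=1}^{(n-1)/2}\left(1 - \frac{\operatorname{sn}^2u}{\operatorname{sn}^2\frac{2siK'}{n}}\right) \div \prod_{s=1}^{(n-1)/2}\left(1 - k^2\operatorname{sn}^2\frac{2siK'}{n}\operatorname{sn}^2u\right), \tag{34.91} \]
where sn u = sn(u,k). He changed k into λ so that λ₁ then changed to k. Denoting the
new value of M₁ by M′, he had the relations
\[ M' = \frac{\Lambda}{K} = \frac{1}{nM} \quad\text{or}\quad n = \frac{1}{MM'}. \]

33 ibid. pp. 261-281.

Then, replacing u by $\frac{u}{M}$ in the transformation obtained after changing k to λ, he
reached his first supplementary transformation
\[ \operatorname{sn}(nu,k) = nM\operatorname{sn}\left(\frac{u}{M},\lambda\right)\frac{N}{D}, \tag{34.92} \]
where
\[ N = \prod_{s=1}^{(n-1)/2}\left(1 - \frac{\operatorname{sn}^2\left(\frac{u}{M},\lambda\right)}{\operatorname{sn}^2\left(\frac{2si\Lambda'}{n},\lambda\right)}\right) \]
and
\[ D = \prod_{s=1}^{(n-1)/2}\left(1 - \lambda^2\operatorname{sn}^2\left(\frac{2si\Lambda'}{n},\lambda\right)\operatorname{sn}^2\left(\frac{u}{M},\lambda\right)\right). \]
Similarly, Jacobi had formulas for the functions cn and dn.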


34.9 Jacobi: Infinite Products for Elliptic Functions 


In 1828-1829, Jacobi obtained his initial infinite products for the elliptic functions sn,
cn, and dn.³⁴ To do this, he took his order n supplementary transformations for these
functions, such as (34.92) for sn, and let the integer n tend to infinity. He noted that
since k² was less than 1, it followed that kⁿ tended to zero and hence by equation
(34.85) λ = 0, am(u,λ) = u, and sn(θ,λ) = sin θ. This then implied that the
corresponding complete integral Λ was equal to $\frac{\pi}{2}$. Moreover, since by (34.88) and
(34.90) $\Lambda = \frac{K}{nM}$ and $\Lambda' = \frac{K'}{M}$, it followed that
\[ nM = \frac{2K}{\pi}, \qquad \frac{\Lambda'}{n} = \frac{K'}{nM} = \frac{\pi K'}{2K}. \tag{34.93} \]


Jacobi also had 


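and he set
\[ \operatorname{sn}\left(\frac{u}{nM},\lambda\right) = \sin\left(\frac{\pi u}{2K}\right) = y. \]
Replacing nu by u in (34.92), he let n → ∞ to obtain the product formula:
\[ \operatorname{sn}u = \frac{2Ky}{\pi}\cdot\frac{\left(1 - \frac{y^2}{\sin^2\frac{i\pi K'}{K}}\right)\left(1 - \frac{y^2}{\sin^2\frac{2i\pi K'}{K}}\right)\left(1 - \frac{y^2}{\sin^2\frac{3i\pi K'}{K}}\right)\cdots}{\left(1 - \frac{y^2}{\sin^2\frac{i\pi K'}{2K}}\right)\left(1 - \frac{y^2}{\sin^2\frac{3i\pi K'}{2K}}\right)\left(1 - \frac{y^2}{\sin^2\frac{5i\pi K'}{2K}}\right)\cdots}. \]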


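34 Jacobi (1969) vol. 1, pp. 141-146.

In a similar way, he got
\[ \operatorname{cn}u = \sqrt{1 - y^2}\cdot\frac{\left(1 - \frac{y^2}{\cos^2\frac{i\pi K'}{K}}\right)\left(1 - \frac{y^2}{\cos^2\frac{2i\pi K'}{K}}\right)\left(1 - \frac{y^2}{\cos^2\frac{3i\pi K'}{K}}\right)\cdots}{\left(1 - \frac{y^2}{\sin^2\frac{i\pi K'}{2K}}\right)\left(1 - \frac{y^2}{\sin^2\frac{3i\pi K'}{2K}}\right)\left(1 - \frac{y^2}{\sin^2\frac{5i\pi K'}{2K}}\right)\cdots} \tag{34.94} \]
and
\[ \operatorname{dn}u = \frac{\left(1 - \frac{y^2}{\cos^2\frac{i\pi K'}{2K}}\right)\left(1 - \frac{y^2}{\cos^2\frac{3i\pi K'}{2K}}\right)\left(1 - \frac{y^2}{\cos^2\frac{5i\pi K'}{2K}}\right)\cdots}{\left(1 - \frac{y^2}{\sin^2\frac{i\pi K'}{2K}}\right)\left(1 - \frac{y^2}{\sin^2\frac{3i\pi K'}{2K}}\right)\left(1 - \frac{y^2}{\sin^2\frac{5i\pi K'}{2K}}\right)\cdots}. \tag{34.95} \]
Recall that Abel obtained his similar product formula (34.27) for φα using a different
method. Jacobi then set $e^{-\pi K'/K} = q$, $u = \frac{2Kx}{\pi}$, and y = sin x to obtain
\[ \sin\frac{mi\pi K'}{K} = \frac{q^m - q^{-m}}{2i} = \frac{i(1 - q^{2m})}{2q^m}, \tag{34.96} \]
\[ \cos\frac{mi\pi K'}{K} = \frac{q^m + q^{-m}}{2} = \frac{1 + q^{2m}}{2q^m}, \tag{34.97} \]
\[ 1 - \frac{y^2}{\sin^2\frac{mi\pi K'}{K}} = 1 + \frac{4q^{2m}\sin^2x}{(1 - q^{2m})^2} = \frac{1 - 2q^{2m}\cos 2x + q^{4m}}{(1 - q^{2m})^2}, \tag{34.98} \]
\[ 1 - \frac{y^2}{\cos^2\frac{mi\pi K'}{K}} = 1 - \frac{4q^{2m}\sin^2x}{(1 + q^{2m})^2} = \frac{1 + 2q^{2m}\cos 2x + q^{4m}}{(1 + q^{2m})^2}. \tag{34.99} \]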
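He was then able to rewrite the products as
\[ \operatorname{sn}\frac{2Kx}{\pi} = \frac{2AK}{\pi}\sin x\,\frac{(1 - 2q^2\cos 2x + q^4)(1 - 2q^4\cos 2x + q^8)(1 - 2q^6\cos 2x + q^{12})\cdots}{(1 - 2q\cos 2x + q^2)(1 - 2q^3\cos 2x + q^6)(1 - 2q^5\cos 2x + q^{10})\cdots}, \]
\[ \operatorname{cn}\frac{2Kx}{\pi} = B\cos x\,\frac{(1 + 2q^2\cos 2x + q^4)(1 + 2q^4\cos 2x + q^8)(1 + 2q^6\cos 2x + q^{12})\cdots}{(1 - 2q\cos 2x + q^2)(1 - 2q^3\cos 2x + q^6)(1 - 2q^5\cos 2x + q^{10})\cdots}, \]
\[ \operatorname{dn}\frac{2Kx}{\pi} = C\,\frac{(1 + 2q\cos 2x + q^2)(1 + 2q^3\cos 2x + q^6)(1 + 2q^5\cos 2x + q^{10})\cdots}{(1 - 2q\cos 2x + q^2)(1 - 2q^3\cos 2x + q^6)(1 - 2q^5\cos 2x + q^{10})\cdots}. \]
Here
\[ A = \left\{\frac{(1-q)(1-q^3)(1-q^5)\cdots}{(1-q^2)(1-q^4)(1-q^6)\cdots}\right\}^2, \]
\[ B = \left\{\frac{(1-q)(1-q^3)(1-q^5)\cdots}{(1+q^2)(1+q^4)(1+q^6)\cdots}\right\}^2, \]
\[ C = \left\{\frac{(1-q)(1-q^3)(1-q^5)\cdots}{(1+q)(1+q^3)(1+q^5)\cdots}\right\}^2. \]
Jacobi set $x = \frac{\pi}{2}$ and observed that since sn K = 1 and $\operatorname{dn}K = \sqrt{1 - k^2\operatorname{sn}^2K} = \sqrt{1 - k^2} = k'$,
\[ k' = C\cdot C \quad\text{or}\quad C = \sqrt{k'}. \]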
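To rewrite these formulas in a more useful form, he changed x to $x + \frac{i\pi K'}{2K}$ in the
first equation, so that by the addition formula
\[ \operatorname{sn}\left(\frac{2Kx}{\pi} + iK'\right) = \frac{1}{k\operatorname{sn}\frac{2Kx}{\pi}}, \]
\[ \cos 2\left(x + \frac{i\pi K'}{2K}\right) = \frac{e^{2ix - \frac{\pi K'}{K}} + e^{-2ix + \frac{\pi K'}{K}}}{2} = \frac{1}{2}\left(qe^{2ix} + \frac{1}{q}e^{-2ix}\right). \]
Note that the first product formula could be written as
\[ \operatorname{sn}\frac{2Kx}{\pi} = \frac{2AK}{\pi}\left(\frac{e^{ix} - e^{-ix}}{2i}\right)\frac{(1 - q^2e^{2ix})(1 - q^2e^{-2ix})(1 - q^4e^{2ix})(1 - q^4e^{-2ix})\cdots}{(1 - qe^{2ix})(1 - qe^{-2ix})(1 - q^3e^{2ix})(1 - q^3e^{-2ix})\cdots}. \tag{34.100} \]
Observe that after applying $x \to x + \frac{i\pi K'}{2K}$, the formula would become
\[ \frac{1}{k\operatorname{sn}\frac{2Kx}{\pi}} = \frac{2AK}{\pi}\left(\frac{\sqrt{q}\,e^{ix} - \frac{1}{\sqrt{q}}e^{-ix}}{2i}\right)\frac{(1 - q^3e^{2ix})(1 - qe^{-2ix})(1 - q^5e^{2ix})(1 - q^3e^{-2ix})\cdots}{(1 - q^2e^{2ix})(1 - e^{-2ix})(1 - q^4e^{2ix})(1 - q^2e^{-2ix})\cdots}. \tag{34.101} \]
Multiply equations (34.100) and (34.101) to obtain Jacobi's result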


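\[ \frac{1}{k} = \frac{1}{4\sqrt{q}}\left(\frac{2AK}{\pi}\right)^2 \quad\text{or}\quad A = \frac{q^{1/4}}{\sqrt{k}}\cdot\frac{\pi}{K}. \]
Jacobi then set $x = \frac{\pi}{2}$ in (34.100) and applied $C = \sqrt{k'}$ to get
\[ 1 = \frac{2AK}{\pi}\left\{\frac{(1+q^2)(1+q^4)(1+q^6)\cdots}{(1+q)(1+q^3)(1+q^5)\cdots}\right\}^2 = \frac{2\sqrt{k'}AK}{\pi B}. \]
Jacobi was then in a position to rewrite the products:
\[ \operatorname{sn}\frac{2Kx}{\pi} = \frac{2q^{1/4}}{\sqrt{k}}\sin x\,\frac{(1 - 2q^2\cos 2x + q^4)(1 - 2q^4\cos 2x + q^8)\cdots}{(1 - 2q\cos 2x + q^2)(1 - 2q^3\cos 2x + q^6)\cdots}, \tag{34.102} \]
\[ \operatorname{cn}\frac{2Kx}{\pi} = 2q^{1/4}\sqrt{\frac{k'}{k}}\cos x\,\frac{(1 + 2q^2\cos 2x + q^4)(1 + 2q^4\cos 2x + q^8)\cdots}{(1 - 2q\cos 2x + q^2)(1 - 2q^3\cos 2x + q^6)\cdots}, \tag{34.103} \]
\[ \operatorname{dn}\frac{2Kx}{\pi} = \sqrt{k'}\,\frac{(1 + 2q\cos 2x + q^2)(1 + 2q^3\cos 2x + q^6)\cdots}{(1 - 2q\cos 2x + q^2)(1 - 2q^3\cos 2x + q^6)\cdots}. \tag{34.104} \]
Thus, from the products for A, B, and C, Jacobi had infinite products for $\frac{2K}{\pi}$, k′
and k:
\[ \frac{2K}{\pi} = \left\{\frac{(1-q^2)(1-q^4)(1-q^6)\cdots}{(1-q)(1-q^3)(1-q^5)\cdots}\right\}^2\left\{\frac{(1+q)(1+q^3)(1+q^5)\cdots}{(1+q^2)(1+q^4)(1+q^6)\cdots}\right\}^2, \]
\[ k' = \left\{\frac{(1-q)(1-q^3)(1-q^5)\cdots}{(1+q)(1+q^3)(1+q^5)\cdots}\right\}^4, \]
\[ k = 4\sqrt{q}\left\{\frac{(1+q^2)(1+q^4)(1+q^6)\cdots}{(1+q)(1+q^3)(1+q^5)\cdots}\right\}^4. \]
After obtaining $\frac{2K}{\pi}$ as an infinite product, Jacobi applied the triple product identity
to express this product as a theta series:³⁵
\[ \sqrt{\frac{2K}{\pi}} = 1 + 2q + 2q^4 + 2q^9 + 2q^{16} + \cdots. \tag{34.105} \]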


This formula laid the basis for Jacobi’s results on the sums of squares, two of which 
reproved theorems of Fermat. 
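The infinite products for 2K/π, k′, and k, and the theta series for the square root of 2K/π, lend themselves to a quick numerical comparison. The sketch below assumes the nome q = exp(−πK′/K) and evaluates K and K′ by the arithmetic-geometric mean (an independent method not used in the text); products are truncated at a fixed number of factors, and all function names are illustrative.

```python
import math

def agm(a, b):
    """Arithmetic-geometric mean with a fixed number of iterations."""
    for _ in range(40):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def complete_K(k):
    """Complete elliptic integral of the first kind via the AGM."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))

def product_residuals(k, factors=200):
    """Compare Jacobi's products and theta series with direct values for modulus k."""
    kp = math.sqrt(1.0 - k * k)
    K, Kp = complete_K(k), complete_K(kp)
    q = math.exp(-math.pi * Kp / K)
    odd_minus = odd_plus = even_minus = even_plus = 1.0
    for m in range(1, factors + 1):
        odd_minus *= 1.0 - q ** (2 * m - 1)
        odd_plus *= 1.0 + q ** (2 * m - 1)
        even_minus *= 1.0 - q ** (2 * m)
        even_plus *= 1.0 + q ** (2 * m)
    two_K_over_pi = (even_minus / odd_minus) ** 2 * (odd_plus / even_plus) ** 2
    k_product = 4.0 * math.sqrt(q) * (even_plus / odd_plus) ** 4
    kp_product = (odd_minus / odd_plus) ** 4
    theta = 1.0 + 2.0 * sum(q ** (n * n) for n in range(1, 30))
    target = 2.0 * K / math.pi
    return (two_K_over_pi - target, k_product - k, kp_product - kp, theta**2 - target)

if __name__ == "__main__":
    print(product_residuals(0.8))
```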


35 ibid. p. 235.

34.10 Jacobi: Sums of Squares


In 1750, Euler suggested that problems on sums of squares could most naturally be 
studied through the series whose powers were squares. In his 1828 paper on elliptic 
functions,³⁶ Jacobi followed Euler's suggestion, with great success. Though primarily
an analyst, Jacobi had a strong interest in number theory, leading him to perceive that
his famous formula (34.105) could be employed to obtain Fermat's theorems on sums
of two and four squares. In fact, Jacobi also found analytic formulas implying results
for sums of six and eight squares. He arrived at all these results, including (34.105),
through his product expansions of the doubly periodic elliptic functions. Recall that in
an analogous manner, Euler evaluated the zeta values at the even integers by means of
the infinite product expansions of the singly periodic trigonometric functions. Also
note that the period K of the elliptic function was obtained as a value of a theta
function, while, as in the Madhava–Leibniz formula, the period π of a trigonometric
function was expressed as a value of an L-series.

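To derive the formulas necessary to work with sums of squares, Jacobi first took
the logarithmic derivatives of the product expansions for the elliptic functions sn, cn,
and dn. First note
\[ \log(1 - 2q^m\cos 2x + q^{2m}) = \log(1 - q^me^{2ix}) + \log(1 - q^me^{-2ix}) = -\sum_{l=1}^{\infty}\frac{2q^{lm}\cos 2lx}{l}. \]
Combining this relation with the geometric series $\frac{q^l}{1+q^l} = q^l - q^{2l} + q^{3l} - \cdots$ gives us
Jacobi's formulas; he simply wrote them down without details:³⁷
\[ \log\operatorname{sn}\frac{2Kx}{\pi} = \log\left(\frac{2q^{1/4}}{\sqrt{k}}\sin x\right) + \frac{2q\cos 2x}{1+q} + \frac{2q^2\cos 4x}{2(1+q^2)} + \frac{2q^3\cos 6x}{3(1+q^3)} + \cdots, \]
\[ \log\operatorname{cn}\frac{2Kx}{\pi} = \log\left(2q^{1/4}\sqrt{\frac{k'}{k}}\cos x\right) + \frac{2q\cos 2x}{1-q} + \frac{2q^2\cos 4x}{2(1+q^2)} + \frac{2q^3\cos 6x}{3(1-q^3)} + \cdots, \]
\[ \log\operatorname{dn}\frac{2Kx}{\pi} = \log\sqrt{k'} + \frac{4q\cos 2x}{1-q^2} + \frac{4q^3\cos 6x}{3(1-q^6)} + \frac{4q^5\cos 10x}{5(1-q^{10})} + \cdots. \]
To obtain the derivatives of these formulas, Jacobi observed that
\[ \frac{d}{dx}\log\operatorname{sn}\frac{2Kx}{\pi} = \frac{2k'K}{\pi}\cdot\frac{\operatorname{cn}\left(\frac{2Kx}{\pi}\right)}{\operatorname{cn}\left(K - \frac{2Kx}{\pi}\right)}, \tag{34.106} \]
\[ -\frac{d}{dx}\log\operatorname{cn}\frac{2Kx}{\pi} = \frac{2K}{\pi}\cdot\frac{\operatorname{sn}\left(\frac{2Kx}{\pi}\right)}{\operatorname{sn}\left(K - \frac{2Kx}{\pi}\right)}, \tag{34.107} \]
\[ -\frac{d}{dx}\log\operatorname{dn}\frac{2Kx}{\pi} = \frac{2k^2K}{\pi}\operatorname{sn}\left(\frac{2Kx}{\pi}\right)\operatorname{sn}\left(K - \frac{2Kx}{\pi}\right). \tag{34.108} \]

36 ibid. pp. 255-263, especially p. 262.
37 ibid. pp. 155-170.

Thus, he obtained
\[ \frac{2k'K}{\pi}\cdot\frac{\operatorname{cn}\frac{2Kx}{\pi}}{\operatorname{cn}\left(K - \frac{2Kx}{\pi}\right)} = \cot x - \frac{4q\sin 2x}{1+q} - \frac{4q^2\sin 4x}{1+q^2} - \frac{4q^3\sin 6x}{1+q^3} - \cdots, \]
\[ \frac{2K}{\pi}\cdot\frac{\operatorname{sn}\frac{2Kx}{\pi}}{\operatorname{sn}\left(K - \frac{2Kx}{\pi}\right)} = \tan x + \frac{4q\sin 2x}{1-q} + \frac{4q^2\sin 4x}{1+q^2} + \frac{4q^3\sin 6x}{1-q^3} + \cdots, \]
\[ \frac{2k^2K}{\pi}\operatorname{sn}\frac{2Kx}{\pi}\operatorname{sn}\left(K - \frac{2Kx}{\pi}\right) = \frac{8q\sin 2x}{1-q^2} + \frac{8q^3\sin 6x}{1-q^6} + \frac{8q^5\sin 10x}{1-q^{10}} + \cdots. \]
Note that when $x = \frac{\pi}{4}$ in the second equation, we get the Lambert series for $\frac{2K}{\pi}$:
\[ \frac{2K}{\pi} = 1 + \frac{4q}{1-q} - \frac{4q^3}{1-q^3} + \frac{4q^5}{1-q^5} - \cdots. \]
Also, the derivative of the second equation at x = 0 gives us the Lambert series for
the square of $\frac{2K}{\pi}$:
\[ \left(\frac{2K}{\pi}\right)^2 = 1 + \frac{8q}{1-q} + \frac{16q^2}{1+q^2} + \frac{24q^3}{1-q^3} + \cdots. \]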


By further manipulation of the products, using differentiation and series expansions,
Jacobi obtained formulas for the cubes and fourth powers, as given in sections
40-42 of his Fundamenta Nova:
\[ \left(\frac{2K}{\pi}\right)^3 = 1 + 16\sum_{n=1}^{\infty}\frac{n^2q^n}{1 + q^{2n}} + 4\sum_{n=1}^{\infty}(-1)^n\frac{(2n-1)^2q^{2n-1}}{1 - q^{2n-1}}, \tag{34.109} \]
\[ \left(\frac{2K}{\pi}\right)^4 = 1 + 16\sum_{n=1}^{\infty}\frac{n^3q^n}{1 + (-1)^{n-1}q^n}. \tag{34.110} \]
The reader may observe that, by expressing the Lambert series in the last four
equations as power series in q, we obtain the number of representations of an integer
as the sum of two, four, six, and eight squares.
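The passage from Lambert series to representation numbers can be made concrete for two and four squares. The following sketch expands the two Lambert series 1 + 4q/(1 − q) − 4q³/(1 − q³) + ⋯ and 1 + 8q/(1 − q) + 16q²/(1 + q²) + 24q³/(1 − q³) + ⋯ as power series and compares them with the coefficients of the second and fourth powers of 1 + 2q + 2q⁴ + 2q⁹ + ⋯, which count representations with order and signs taken into account. Function names are illustrative.

```python
def theta_power_coeffs(power, N):
    """Coefficients up to q^N of (1 + 2q + 2q^4 + 2q^9 + ...)^power."""
    theta = [0] * (N + 1)
    theta[0] = 1
    m = 1
    while m * m <= N:
        theta[m * m] = 2
        m += 1
    result = [1] + [0] * N
    for _ in range(power):
        new = [0] * (N + 1)
        for i, a in enumerate(result):
            if a:
                for j in range(N + 1 - i):
                    if theta[j]:
                        new[i + j] += a * theta[j]
        result = new
    return result

def lambert_two_squares(N):
    """Power series of 1 + 4q/(1-q) - 4q^3/(1-q^3) + 4q^5/(1-q^5) - ... up to q^N."""
    c = [0] * (N + 1)
    c[0] = 1
    for d in range(1, N + 1, 2):
        sign = 1 if d % 4 == 1 else -1
        for multiple in range(d, N + 1, d):
            c[multiple] += 4 * sign
    return c

def lambert_four_squares(N):
    """Power series of 1 + 8q/(1-q) + 16q^2/(1+q^2) + 24q^3/(1-q^3) + ... up to q^N."""
    c = [0] * (N + 1)
    c[0] = 1
    for l in range(1, N + 1):
        ratio = 1 if l % 2 == 1 else -1  # q^l/(1 - ratio*q^l) = sum_j ratio^(j-1) q^(l*j)
        term = 1
        for j in range(1, N // l + 1):
            c[l * j] += 8 * l * term
            term *= ratio
    return c

if __name__ == "__main__":
    print(lambert_two_squares(10))
    print(lambert_four_squares(10))
```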

In the final paragraph of his Fundamenta, Jacobi gave a number theoretic interpretation
of his analytic formula for the sums of four squares, but he did not write down
interpretations for the other formulas. In 1865, Henry Smith gave these explicitly, in
sections 95 and 127 of his report on number theory:³⁸

The number of representations of any uneven (or unevenly even) number by the form $x^2 + y^2$ is
the quadruple of the excess of the number of its divisors of the form 4n + 1, above the number of
its divisors of the form 4n + 3.

The number of representations of any number N as a sum of four squares is eight times the sum
of its divisors if N is uneven, twenty-four times the sum of its uneven divisors if N is even.

The number of representations of any number N as a sum of six squares is $4\sum(-1)^{\frac{\delta-1}{2}}(4\delta'^2 - \delta^2)$,
δ denoting any uneven divisor of N, δ′ its conjugate divisor. In particular if N ≡ 1, mod 4,
the number of representations is $12\sum(-1)^{\frac{\delta-1}{2}}\delta^2$; if N ≡ −1, mod 4, it is $-20\sum(-1)^{\frac{\delta-1}{2}}\delta^2$.

The number of representations of any uneven number as a sum of eight squares is sixteen times
the sum of the cubes of its divisors; for an even number it is sixteen times the excess of the cubes
of the even divisors above the cubes of the uneven divisors.
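Smith's statements for six and eight squares can be tested against direct counts. The sketch below reads the six-squares count as 4Σ(−1)^((δ−1)/2)(4δ′² − δ²) over odd divisors δ of N with δ′ = N/δ, and the eight-squares count as stated in the quotation; the representation numbers themselves are obtained as coefficients of powers of 1 + 2q + 2q⁴ + ⋯. The function names are hypothetical.

```python
def theta_power_coeffs(power, N):
    """Coefficients up to q^N of (1 + 2q + 2q^4 + 2q^9 + ...)^power."""
    theta = [0] * (N + 1)
    theta[0] = 1
    m = 1
    while m * m <= N:
        theta[m * m] = 2
        m += 1
    result = [1] + [0] * N
    for _ in range(power):
        new = [0] * (N + 1)
        for i, a in enumerate(result):
            if a:
                for j in range(N + 1 - i):
                    if theta[j]:
                        new[i + j] += a * theta[j]
        result = new
    return result

def smith_six_squares(n):
    """4 * sum over odd divisors d of n of (-1)^((d-1)/2) * (4*(n/d)^2 - d^2)."""
    total = 0
    for d in range(1, n + 1, 2):
        if n % d == 0:
            conj = n // d
            sign = 1 if d % 4 == 1 else -1
            total += sign * (4 * conj * conj - d * d)
    return 4 * total

def smith_eight_squares(n):
    """Sixteen times the sum of cubes of divisors (n odd), or even cubes minus odd cubes (n even)."""
    odd = sum(d**3 for d in range(1, n + 1) if n % d == 0 and d % 2 == 1)
    even = sum(d**3 for d in range(1, n + 1) if n % d == 0 and d % 2 == 0)
    return 16 * odd if n % 2 == 1 else 16 * (even - odd)

if __name__ == "__main__":
    print([smith_six_squares(n) for n in range(1, 6)])
    print([smith_eight_squares(n) for n in range(1, 6)])
```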


In his July 1828 paper in Crelle's Journal, Jacobi gave a beautiful application of
(34.105) to derive a very efficient proof of the transformation formula for a theta
function:³⁹
\[ \sqrt{\frac{1}{x}} = \frac{1 + 2e^{-\pi x} + 2e^{-4\pi x} + 2e^{-9\pi x} + 2e^{-16\pi x} + \cdots}{1 + 2e^{-\frac{\pi}{x}} + 2e^{-\frac{4\pi}{x}} + 2e^{-\frac{9\pi}{x}} + 2e^{-\frac{16\pi}{x}} + \cdots}. \tag{34.111} \]
Cauchy found this in 1817, and Poisson did so in 1823, though Jacobi referred only
to Poisson. Jacobi observed that if the moduli k and k′ were interchanged, then K and
K′ would also be interchanged. Thus, with $x = \frac{K'}{K}$, (34.105) implied
\[ \sqrt{\frac{2K'}{\pi}} = 1 + 2e^{-\frac{\pi}{x}} + 2e^{-\frac{4\pi}{x}} + 2e^{-\frac{9\pi}{x}} + 2e^{-\frac{16\pi}{x}} + \cdots. \]
Dividing (34.105) by this equation gave him the required transformation. As we
shall see in the next section, in 1836 Cauchy applied (34.111) to evaluate a Gauss
sum, and in 1840 he provided a more succinct argument.
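The theta transformation, in the form that the ratio of 1 + 2e^(−πx) + 2e^(−4πx) + ⋯ to the same series with x replaced by 1/x equals the square root of 1/x, is easy to test numerically. This is a minimal sketch with a truncated series; the truncation length and function names are arbitrary choices.

```python
import math

def theta(x, terms=60):
    """Truncated series 1 + 2*sum_{n>=1} exp(-pi n^2 x) for x > 0."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * x) for n in range(1, terms))

def transformation_residual(x):
    """theta(x)/theta(1/x) - sqrt(1/x), which should vanish up to rounding."""
    return theta(x) / theta(1.0 / x) - math.sqrt(1.0 / x)

if __name__ == "__main__":
    for x in (0.3, 1.0, 2.5):
        print(x, transformation_residual(x))
```

The check also illustrates why the transformation is useful for computation: for small x the series in x converges slowly, while the series in 1/x is dominated by its first term.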

It is interesting to note that Euler foresaw, albeit vaguely, Jacobi's manner of proof
for the four squares theorem and the importance of the transformation of the theta
function. In a letter to Goldbach of August 17, 1750,⁴⁰ Euler discussed the series
\[ 1 + x + x^4 + x^9 + x^{16} + \cdots. \]
He wrote that he had approximately evaluated to several decimal places this series
for values of x close to 1, a remarkable calculation since the series is very slowly
convergent. He commented that it would be very useful if a method could be found
for efficiently summing the series for such values. And the transformation of theta
functions accomplishes just this task. Moreover, in the same letter Euler mentioned
Fermat's remarkable theorem that every number could be expressed as a sum of three
triangular numbers, four squares, five pentagonal numbers, and so on. He remarked
that the most natural way to prove this proposition might be to show that the coefficient
of every power of x must be positive in the series:
\[ (1 + x + x^3 + x^6 + \cdots)^3, \quad (1 + x + x^4 + x^9 + \cdots)^4, \quad\text{and so on.} \]

38 See Smith (1965).
39 Jacobi (1969) vol. 1, p. 260.
40 Fuss (1968) vol. 1, pp. 530-532.


34.11 Cauchy: Theta Transformations and Gauss Sums 


Cauchy's 1817 derivation of his transformation of the theta function depended upon
the theorem now known as the Poisson summation formula.⁴¹ Cauchy was the first to
discover this result, and he did so in the course of his work on the theory of waves.
For the Poisson summation formula, consult Section 20.4. Independent of Fourier's
earlier work, Cauchy also discovered the reciprocity of the Fourier cosine transform,
given by
\[ f(x) = \sqrt{\frac{2}{\pi}}\int_0^{\infty}\phi(t)\cos tx\,dt; \qquad \phi(x) = \sqrt{\frac{2}{\pi}}\int_0^{\infty}f(t)\cos tx\,dt. \]
He gave his summation formula in the form of the relation
\[ \sqrt{\alpha}\sum f(n\alpha) = \sqrt{\beta}\sum\phi(n\beta), \]
where αβ = 2π, and the summation was taken over all integers. Cauchy obtained his
transformation formula by setting f(x) equal to the function he called the reciprocal
function, $e^{-x^2/2}$, and then setting $\phi(x) = e^{-x^2/2}$ in the summation formula. He then took
$\alpha = \sqrt{2}\,a$ and $\beta = \sqrt{2}\,b$ and stated the transformation as
\[ a^{1/2}\left(\frac{1}{2} + e^{-a^2} + e^{-4a^2} + e^{-9a^2} + \cdots\right) = b^{1/2}\left(\frac{1}{2} + e^{-b^2} + e^{-4b^2} + e^{-9b^2} + \cdots\right), \tag{34.112} \]
\[ ab = \pi. \tag{34.113} \]
Note that (34.112) describes the transformation of the theta function (or theta constant)
\[ \sum_{n=-\infty}^{\infty}e^{\pi in^2\tau} \]
under the mapping $\tau \to -\frac{1}{\tau}$.

Cauchy applied a very interesting idea to evaluate Gauss sums from (34.112).
Taking n to be an integer, he set $\tau = \frac{2}{n} + i\alpha$ in the transformation formula, and
let α → 0. The asymptotic behavior of the two sides of the formula then yielded the
result. Note that the theta function is analytic in the upper half plane and every point
of the real line is a singular point.


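41 Cauchy (1882-1974) Ser. 1, vol. 1, pp. 5-318, especially pp. 300-303.

In 1840, Cauchy published his quadratic Gauss sum evaluation in Liouville's
Journal. He noted that (34.112) could be rewritten as
\[ a\left(\frac{1}{2} + e^{-a^2} + e^{-4a^2} + \cdots\right) = \sqrt{\pi}\left(\frac{1}{2} + e^{-\frac{\pi^2}{a^2}} + e^{-\frac{4\pi^2}{a^2}} + \cdots\right). \]
Cauchy remarked that this step could be verified by the fact that the limit as a → 0 of
the product
\[ a\left(\frac{1}{2} + e^{-a^2} + e^{-4a^2} + e^{-9a^2} + \cdots\right) \]
was the integral
\[ \int_0^{\infty}e^{-x^2}\,dx = \frac{1}{2}\pi^{1/2}. \]
With n a positive integer and $a^2 = -\frac{2\pi}{n}\sqrt{-1}$, $b^2 = \frac{n\pi}{2}\sqrt{-1}$, Cauchy could obtain
\[ e^{-(n+k)^2a^2} = e^{-k^2a^2}, \tag{34.114} \]
\[ e^{-(2k)^2b^2} = 1, \qquad e^{-(2k+1)^2b^2} = e^{-b^2} = e^{-\frac{n\pi}{2}\sqrt{-1}}. \tag{34.115} \]
He then set $a^2 = \alpha^2 - \frac{2\pi}{n}\sqrt{-1}$ and $b^2 = \beta^2 + \frac{n\pi}{2}\sqrt{-1}$ where α and β were infinitely
small numbers and where 2β = nα. The last condition was needed to satisfy the
requirement ab = π. After substituting these values of a² and b² in (34.112), he
multiplied the equation by nα = 2β and remarked that the result was
\[ a^{1/2}\Delta = b^{1/2}\left(1 + e^{-\frac{n\pi}{2}\sqrt{-1}}\right), \tag{34.116} \]
where Δ was the Gauss sum
\[ \Delta = 1 + e^{\frac{2\pi\sqrt{-1}}{n}} + e^{4\cdot\frac{2\pi\sqrt{-1}}{n}} + e^{9\cdot\frac{2\pi\sqrt{-1}}{n}} + \cdots + e^{(n-1)^2\frac{2\pi\sqrt{-1}}{n}}. \tag{34.117} \]
From (34.116) and (34.113) Cauchy completed his evaluation:
\[ \Delta = \frac{b^{1/2}}{a^{1/2}}\left(1 + e^{-\frac{n\pi}{2}\sqrt{-1}}\right) = \frac{n^{1/2}}{2}\left(1 + \sqrt{-1}\right)\left(1 + e^{-\frac{n\pi}{2}\sqrt{-1}}\right). \tag{34.118} \]
To see why (34.116) holds true, we note that by (34.114),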


n CO 
2 2 2 Qik? = 209 9 
neo Be Bee 4 --)=na) e 7 ye ee (34.119) 
k=1 s=0 
Moreover, 
oo oo 
\0 2 a JI 
io ee a =) Paar fae re 
s=0 0 


and hence the expression (34.119) equals a And, by using (34.115), we can show 


that 


wpdte% +e pe 4...) = (te FV) 


Thus, Cauchy’s equation (34.116) is verified. 
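Cauchy's closed form can be checked numerically. The following is a minimal sketch, assuming the Gauss sum in (34.117) runs over the squares 0, 1, 4, ..., (n − 1)² and writing i for √−1; the function names are illustrative and do not come from Cauchy or from the text.

```python
import cmath
import math

def gauss_sum(n):
    """Sum of exp(2*pi*i*k^2/n) for k = 0, ..., n-1, as read from (34.117)."""
    # k*k is reduced mod n first so the argument of exp stays small.
    return sum(cmath.exp(2j * math.pi * (k * k % n) / n) for k in range(n))

def cauchy_closed_form(n):
    """Right side of (34.118): (sqrt(n)/2) (1 + i) (1 + exp(-n*pi*i/2))."""
    return 0.5 * math.sqrt(n) * (1 + 1j) * (1 + cmath.exp(-1j * math.pi * n / 2))

for n in range(1, 13):
    # The two columns should agree up to rounding error.
    print(n, gauss_sum(n), cauchy_closed_form(n))
```

For n = 3 both expressions give i√3, and for n = 4 both give 2 + 2i, which can be confirmed by hand.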

In the first part of his 1859 Report on the Theory of Numbers, Smith noted that 
Cauchy’s method could be applied to derive the more general reciprocity relation for 
Gauss sums.⁴² He set


and took 


in (34.112) to find the reciprocity relation

$$\psi(m,n) = \frac14\sqrt{\frac{n}{m}}\,(1 + i)\,\psi(-n,4m). \tag{34.120}$$

He also observed that from ψ(−4ν,4m) = 4ψ(−ν,m) and (34.120), it followed that

$$\psi(m,4\nu) = 2\sqrt{\frac{\nu}{m}}\,(1 + i)\,\psi(-\nu,m),$$


so that the case with even n would depend upon the case with odd n. Note that it was 
essentially this expression for the reciprocity of Gauss sums that Schaar obtained in 
1850 by using Fourier series. See Section 19.8. 
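The relation (34.120) lends itself to a quick numerical test. The sketch below rests on an assumption: it takes ψ(m, n) to be the Gauss sum of exp(2πi m s²/n) over s = 0, ..., n − 1, a definition inferred from the relation itself, since the displayed definition is not reproduced here. The helper names are hypothetical.

```python
import cmath
import math

def psi(m, n):
    """Assumed Gauss sum: sum over s = 0..n-1 of exp(2*pi*i*m*s^2/n)."""
    # Python's % returns a value in [0, n) even when m is negative.
    return sum(cmath.exp(2j * math.pi * ((m * s * s) % n) / n) for s in range(n))

def reciprocity_rhs(m, n):
    """Right side of (34.120): (1/4) sqrt(n/m) (1 + i) psi(-n, 4m)."""
    return 0.25 * math.sqrt(n / m) * (1 + 1j) * psi(-n, 4 * m)

for m, n in [(1, 3), (2, 5), (3, 4), (5, 7)]:
    # Each difference should be at the level of rounding error.
    print(m, n, abs(psi(m, n) - reciprocity_rhs(m, n)))
```

Under the same assumed definition, the second relation, with modulus 4ν, can be tested by comparing psi(m, 4 * v) with 2 * sqrt(v / m) * (1 + 1j) * psi(-v, m).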

Henry John Stephen Smith, son of an Irish lawyer, studied at Oxford, where 
mathematics was not then popular. He independently read in detail the number 
theoretic work of Gauss, Dirichlet, Eisenstein, Jacobi, Kummer, and others; he became 


42 See Smith (1965b) p. 54, footnote. This page also contains several references to Cauchy’s papers on this 
topic. 
the most outstanding British number theorist of the nineteenth century. An active
member of the British Association for the Advancement of Science, he wrote his well- 
known report on number theory for the association. The report covered developments 
in number theory up to the 1850s. In spite of his important researches in number theory 
and elliptic functions, he worked alone without a following and was mostly ignored 
in his lifetime. Smith is most noted for his 1867 work on the representation of
numbers as sums of squares. Although it established Eisenstein’s unproved theorems 
on sums of five and seven squares, this important paper remained unnoticed. In fact, 
as late as 1883, the Paris Academy offered a prize for the proof of Eisenstein’s 
results. Fortunately, this brought Smith’s work to the notice of mathematicians and 
also succeeded in gaining some prominence for the 18-year-old Hermann Minkowski 
(1864-1909), who offered his own highly original paper on the topic. 


34.12 Eisenstein: Reciprocity Laws 


Even before his great 1847 paper laying the foundations for a new theory of elliptic 
functions, Eisenstein used Abel’s formulas to make some original applications of 
elliptic functions to number theory. In 1845, Eisenstein published “Application de
l’algèbre à l’arithmétique transcendante,” in which he used circular and elliptic
functions to prove the quadratic and biquadratic reciprocity laws.⁴³ We review some
of the then-known number theoretic results upon which Eisenstein based his
work: Let p be an odd prime. Following Gauss, divide the residues modulo p, namely
1, 2, ..., p − 1, into two classes: $r_1, r_2, \ldots, r_{(p-1)/2}$ and $-r_1, -r_2, \ldots, -r_{(p-1)/2}$, so that
every residue falls into exactly one class. Eisenstein took one class to be 1, 2, ..., (p − 1)/2.
Note that then −1, −2, ..., −(p − 1)/2 are identical (mod p) to p − 1, p − 2, ..., (p + 1)/2.


A number a, prime to p, is called a quadratic residue modulo p if the equation

$$x^2 \equiv a \pmod{p} \tag{34.121}$$

has a solution; otherwise, a is a quadratic nonresidue. Eisenstein used a result of Euler
now known as Euler’s criterion, proved by Euler in a paper read to the Berlin Academy
in 1747.⁴⁴ The result stated that a number a, prime to p, is a quadratic residue if and
only if

$$a^{\frac{p-1}{2}} \equiv 1 \pmod{p}. \tag{34.122}$$

From Fermat’s theorem, $a^{p-1} \equiv 1 \pmod{p}$, and hence $a^{(p-1)/2} \equiv \pm 1 \pmod{p}$. With
this, Euler’s criterion can be proved: If a satisfies (34.121), then (34.122) follows by
Fermat’s theorem. From the fact that there are (p − 1)/2 quadratic residues (mod p),
it follows that $x^{(p-1)/2} \equiv 1 \pmod{p}$ has at least (p − 1)/2 solutions. However, the equation of
degree (p − 1)/2 has at most (p − 1)/2 solutions. Hence the quadratic residues comprise all the solutions.


43 Eisenstein (1975) vol. 1, pp. 291-298. 
44 Eu. 1-2 pp. 62-85. E 134. 
To state the law of quadratic reciprocity, we define the Legendre symbol $\left(\frac{a}{p}\right)$ by
the equation

$$a^{\frac{p-1}{2}} \equiv \left(\frac{a}{p}\right) \pmod{p}. \tag{34.123}$$

Note that if a is a multiple of p, then we set $\left(\frac{a}{p}\right) = 0$. The law of quadratic reciprocity
states that if p and q are distinct odd primes, then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}. \tag{34.124}$$


Note that (34.124) is equivalent to the statement that if q is a quadratic residue
(mod p), then p is a quadratic residue (mod q) except when both p and q are of
the form 4n + 3. In the latter case, q is a quadratic residue (mod p) if and only if p is
not a quadratic residue (mod q).
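Euler's criterion and the reciprocity law are easy to experiment with. The sketch below computes the Legendre symbol from (34.123) by modular exponentiation and compares it with a brute-force search for square roots; the function names are illustrative only.

```python
def euler_symbol(a, p):
    """Legendre symbol (a/p) from Euler's criterion (34.123); p is an odd prime."""
    r = pow(a % p, (p - 1) // 2, p)
    # The residue p - 1 represents -1; the values 0 and 1 are returned unchanged.
    return -1 if r == p - 1 else r

def is_quadratic_residue(a, p):
    """Brute-force test of whether x^2 = a (mod p) has a solution, as in (34.121)."""
    return any((x * x - a) % p == 0 for x in range(1, p))

def reciprocity_holds(p, q):
    """Check (34.124) for distinct odd primes p and q."""
    sign = -1 if (((p - 1) // 2) * ((q - 1) // 2)) % 2 else 1
    return euler_symbol(p, q) * euler_symbol(q, p) == sign

print(euler_symbol(2, 7), is_quadratic_residue(2, 7))  # 3^2 = 9 = 2 (mod 7)
print(reciprocity_holds(3, 7), reciprocity_holds(5, 13))
```

The brute-force test is only practical for small p, whereas the exponentiation in `euler_symbol` remains fast for large primes.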

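To begin his proof of the law of quadratic reciprocity,⁴⁵ Eisenstein let r denote a
number in 1, 2, ..., (p − 1)/2. Then

$$qr \equiv \pm r' \pmod{p}, \tag{34.125}$$

where r′ was also contained in 1, 2, ..., (p − 1)/2. Eisenstein observed that since sine was
an odd periodic function,

$$\sin\frac{2\pi qr}{p} = \pm\sin\frac{2\pi r'}{p}. \tag{34.126}$$

Therefore (34.125) could be rewritten as

$$qr \equiv r'\,\frac{\sin\frac{2\pi qr}{p}}{\sin\frac{2\pi r'}{p}} \pmod{p}. \tag{34.127}$$

Substituting the (p − 1)/2 different values of r in (34.127) and multiplying, he obtained

$$q^{\frac{p-1}{2}}\,\Pi r \equiv \Pi r' \prod_{k=1}^{\frac{p-1}{2}} \frac{\sin\frac{2\pi qk}{p}}{\sin\frac{2\pi k}{p}} \pmod{p}. \tag{34.128}$$

Eisenstein saw that Πr and Πr′ were identical and concluded that

$$q^{\frac{p-1}{2}} \equiv \prod_{k=1}^{\frac{p-1}{2}} \frac{\sin\frac{2\pi qk}{p}}{\sin\frac{2\pi k}{p}} \pmod{p}. \tag{34.129}$$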


45. ibid. p. 292. 
Note that by Euler’s criterion, Eisenstein had found a trigonometric expression for the
Legendre symbol $\left(\frac{q}{p}\right)$. By reversing the roles of p and q, he obtained

$$p^{\frac{q-1}{2}} \equiv \prod_{l=1}^{\frac{q-1}{2}} \frac{\sin\frac{2\pi pl}{q}}{\sin\frac{2\pi l}{q}} \pmod{q}. \tag{34.130}$$

At this juncture, Eisenstein employed Euler’s factorization, given in Section 15.5,

$$\frac{\sin px}{\sin x} = (-1)^{\frac{p-1}{2}}\,2^{p-1} \prod_{k=1}^{\frac{p-1}{2}} \left(\sin^2 x - \sin^2\frac{2\pi k}{p}\right),$$

to conclude that the product on the right side of (34.130) equaled

$$C \prod_{k=1}^{\frac{p-1}{2}} \prod_{l=1}^{\frac{q-1}{2}} \left(\sin^2\frac{2\pi l}{q} - \sin^2\frac{2\pi k}{p}\right), \tag{34.131}$$

where

$$C = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}\,2^{\frac{(p-1)(q-1)}{2}}.$$

For Euler’s factorization, see Sections 15.4 and 15.5. Next, by symmetry, Eisenstein
had a similar product for (34.129) with the same constant C, but with factors of
the form

$$\sin^2\frac{2\pi k}{p} - \sin^2\frac{2\pi l}{q}.$$

Thus, each factor in (34.131) was the negative of the corresponding factor in the
product for the expression in (34.129), and the number of such factors was $\frac{p-1}{2}\cdot\frac{q-1}{2}$.
So Eisenstein could obtain the product in (34.130) by multiplying the product in
(34.129) by $(-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}$. Therefore, employing Euler’s criterion, Eisenstein had the
reciprocity law

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$
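Eisenstein's trigonometric expression for the Legendre symbol can be tried out in floating point. The following sketch, with illustrative function names, evaluates the sine product in (34.129) and compares it with the value given by Euler's criterion; it assumes only that p is an odd prime and that q is not divisible by p.

```python
import math

def sine_product(q, p):
    """Product over k = 1..(p-1)/2 of sin(2*pi*q*k/p) / sin(2*pi*k/p), as in (34.129)."""
    product = 1.0
    for k in range(1, (p - 1) // 2 + 1):
        product *= math.sin(2 * math.pi * q * k / p) / math.sin(2 * math.pi * k / p)
    return product

def euler_symbol(a, p):
    """Legendre symbol (a/p) from Euler's criterion; p is an odd prime."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

for p, q in [(3, 5), (5, 7), (7, 11), (11, 13)]:
    # The rounded product should agree with the Legendre symbol (q/p).
    print(p, q, round(sine_product(q, p)), euler_symbol(q, p))
```

Since the product is a real number equal to ±1, multiplying `sine_product(q, p)` by `sine_product(p, q)` gives the sign on the right side of the reciprocity law directly.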


Eisenstein gave a similar proof of the biquadratic (quartic) reciprocity law, but used 
the lemniscatic function instead of the sine function. Again, we consider the backdrop 
to his work. Even while he was working on the Disquisitiones Arithmeticae, Gauss 
started thinking about extending quadratic reciprocity to cubic and quartic residues. 
It appears that he very quickly realized that to state these reciprocity laws he had 
to extend the field of rational numbers by cube roots and fourth roots of unity. It is 
not clear when Gauss found the law of biquadratic or quartic reciprocity. On October 




23, 1813, his mathematical diary noted,⁴⁶ “The foundation of the general theory of
biquadratic residues which we have sought for with utmost effort for almost seven
years but always unsuccessfully at last happily discovered the same day on which our
son is born.” Strangely, in a letter of April 30, 1807, to Sophie Germain (1776-1831),
Gauss had made a similar claim, challenging her to determine the cubic and quartic
residue character of 2.⁴⁷ Perhaps he discovered the theorem in 1807 and proved it in
1813. In any case, Germain obtained some good results on this problem; she found 
the quartic character of —4. And Gauss wrote that, especially given the obstacles to 
women working in mathematics, he was very impressed with her accomplishments. 
However, Germain’s main contribution to number theory was in connection with 
Fermat’s last theorem; she discovered and applied the Germain primes p such that 
2p + 1 was also prime. 

Gauss published two papers on biquadratic reciprocity, in 1828 and 1832.⁴⁸ The
first paper contained a thorough treatment of the biquadratic character of 2 with respect
to a prime p = 4s + 1. Note that by Euler’s criterion, −1 is then a quadratic residue
(mod p); furthermore, p can be expressed as a² + b². Gauss denoted the two solutions
of x² ≡ −1 (mod p) by f and −f. He also took a to be odd and b to be even, and he
took their signs such that a ≡ 1 (mod 4) and b ≡ af (mod p). His theorem stated that
2 satisfied $2^{(p-1)/4} \equiv 1, f, -1, -f \pmod{p}$ where b/2 was of the form 0, 1, 2, 3 (mod 4),
respectively.

In his second paper, Gauss proved that the ring Z[√−1], consisting of the numbers
m + ni where m and n were integers, was a unique factorization domain. In 1859, Smith
commented on this result:⁴⁹ “By thus introducing the conception of imaginary
quantity into arithmetic, its domain, as Gauss observes, is indefinitely extended; nor is
this extension an arbitrary addition to the science, but is essential to the comprehension
of many phenomena presented by real integral numbers themselves.” It is clear from
Gauss’s second paper that since primes of the form 4s + 1, where s is a positive integer,
can be expressed as a sum of two squares, a² + b² = (a + ib)(a − ib), they are not
prime in the ring Z[i]. However, primes of the form 4s + 3 cannot be factored in Z[i].
We see, therefore, that there are three classes of primes in Z[i]: (a) primes of the form
$i^k(4s + 3)$; (b) primes of the form a + ib such that their norm, N(a + ib) = a² + b²,
is a prime of the form 4s + 1 in Z; (c) the primes $i^k(1 + i)$, whose norm is 2. Let
m = a + ib be a prime such that a + b is an odd integer and N(m) = p. In this case,
any number n, not a multiple of m in Z[i], leaves p − 1 possible residues when divided
by m. For such m and n, the quartic symbol $\left(\frac{n}{m}\right)_4$ takes the values ±1 and ±i and is
today defined by

$$\left(\frac{n}{m}\right)_4 \equiv n^{\frac{p-1}{4}} \pmod{m}.$$
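The definition of the quartic symbol can be made concrete with a little Gaussian integer arithmetic. The sketch below is an illustration under stated assumptions: Gaussian integers are stored as (real, imaginary) pairs of Python integers, m is a prime of class (b) with prime norm p, and all helper names are hypothetical.

```python
def gmul(x, y):
    """Product of two Gaussian integers given as (real, imag) integer pairs."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def gmod(x, m):
    """Remainder of x on division by m, using the nearest Gaussian integer quotient."""
    norm = m[0] ** 2 + m[1] ** 2
    num = gmul(x, (m[0], -m[1]))  # x times the conjugate of m
    quot = ((2 * num[0] + norm) // (2 * norm), (2 * num[1] + norm) // (2 * norm))
    prod = gmul(quot, m)
    return (x[0] - prod[0], x[1] - prod[1])

def gpow(x, e, m):
    """x**e reduced modulo m by repeated squaring."""
    result, base = (1, 0), gmod(x, m)
    while e > 0:
        if e & 1:
            result = gmod(gmul(result, base), m)
        base = gmod(gmul(base, base), m)
        e >>= 1
    return result

def quartic_symbol(n, m):
    """Unit u in {1, i, -1, -i} with n**((p-1)/4) = u (mod m), where p = N(m)."""
    p = m[0] ** 2 + m[1] ** 2
    r = gpow(n, (p - 1) // 4, m)
    for unit in ((1, 0), (0, 1), (-1, 0), (0, -1)):
        if gmod((r[0] - unit[0], r[1] - unit[1]), m) == (0, 0):
            return unit
    raise ValueError("n is divisible by m, or m is not of the assumed type")

print(quartic_symbol((2, 0), (2, 1)))   # 2 is congruent to -i modulo 2 + i
print(quartic_symbol((-4, 0), (2, 1)))  # -4 = (1 + i)^4 is a fourth power
```

The remainder returned by `gmod` is not reduced to a canonical representative, which is why `quartic_symbol` tests congruence to each of the four units rather than comparing pairs directly.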


46 See Dunnington (2004) p. 484. 

47 Gauss (1863-1927) vol. 10, part 1, pp. 70-74. 
48° Gauss (1981) pp. 511-586. 

49 Smith (1965b) p. 71. 
To prove quartic reciprocity, Eisenstein divided the p − 1 residues into four classes,
with (p − 1)/4 residues in each class, such that when r was in one class, ir, −r, −ir each
fell into a different one of the other classes.⁵⁰ He noted that for any n in Z[i]

$$nr \equiv r',\ ir',\ -r',\ -ir' \pmod{m}, \tag{34.132}$$


where r’ was in the same class as r. 


He set

$$\omega = 4\int_0^1 \frac{dx}{\sqrt{1 - x^4}}.$$


Then by the periodicity of the lemniscatic function sl z, for m prime, Eisenstein had 
ln) 
sl(22) 


=1,i, —1, or —- 


corresponding to the four cases in (34.132). Hence in all cases, 
_si(2 
nr =r’ (mod m). 
sl(52) 


From this he got the formula analogous to (34.129), 


(ve 


=I; 7c (mod m). (34.133) 


To obtain a formula with m and n interchanged, Eisenstein chose n to be another
complex prime c + id with c + d odd and norm q. He divided the residues of
nonmultiples of n into four classes represented by ρ, iρ, −ρ, −iρ and concluded that


a-1 sl("0°) 
mit= I] We (mod n). (34.134) 


p 


Gauss defined the concept of a primary number so that he could express his results 
in unambiguous form. A number c + id, where c + d was odd, was called primary 
if d was even and c + d — 1 was evenly even (that is, divisible by 4). This definition 
was adopted by Eisenstein. We remark by the way that Gauss also suggested a slightly 
different definition of a primary number, useful in some circumstances; this definition 
was employed by Dirichlet. It is easy to show, and Gauss of course knew, that c + id,
with c + d odd, was primary if and only if c + id ≡ 1 (mod 2 + 2i). It follows that the
product of primary numbers is primary and that the conjugate of a primary number is 
primary. In his work on the division of the lemniscate, Abel showed that for a primary 
number m 


50 Eisenstein (1975) vol. 1, pp. 294-297. 
He also proved that

$$\frac{\operatorname{sl}(mv)}{\operatorname{sl} v} = \frac{\varphi(x^4)}{\psi(x^4)}, \tag{34.135}$$

where x = sl v and where φ(x) and ψ(x) were polynomials of degree (p − 1)/4. See
equation (34.45). Eisenstein improved on this by proving that ψ(x) in (34.135)
satisfied

$$\psi(x) = i^{\nu} x^{\frac{p-1}{4}} \varphi\!\left(\frac{1}{x}\right) \tag{34.136}$$

for some integer ν; he also showed that when m was primary, ν = 0. To prove
(34.136), he noted that y = sl(mv) satisfied the differential equation

$$\frac{dy}{\sqrt{1 - y^4}} = \frac{m\,dx}{\sqrt{1 - x^4}}. \tag{34.137}$$


He set y =1,x= a where jz was an integer yet to be determined. This change of 
variables converted (34.137) to 


i“ dn = mdé 
Jat 1) f/(E4 = 1). 


Eisenstein took jz such that (34.138) would be equivalent to 


(34.138) 


dyn mdé 


Jd-7n) Ja 6&4) 


and concluded 
Eva) 
Oa) 


= jt 


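this immediately implied (34.136). Thus, by (34.135) and (34.136) he obtained

$$\operatorname{sl}(mv) = \frac{x\,\varphi(x^4)}{i^{\nu} x^{p-1} \varphi\!\left(\frac{1}{x^4}\right)}. \tag{34.139}$$

Eisenstein next set v = ω/4 so that sl v = 1 and for primary m, sl(mv) = 1. Thus, for
m primary in (34.136), he had $1 = i^{-\nu}$. He then assumed that n was also a primary
prime so that

$$\frac{\operatorname{sl}(nv)}{\operatorname{sl} v} = \frac{f(x^4)}{x^{q-1} f\!\left(\frac{1}{x^4}\right)},$$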




where f(x) was a polynomial of degree (q − 1)/4. He set


a=si(—), p=s(), 


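so that the solutions of φ(x⁴) = 0 were of the form ±α, ±iα and those of f(x⁴) = 0
were of the form ±β, ±iβ. Thus, he arrived at

$$\frac{\operatorname{sl}(mv)}{\operatorname{sl} v} = \frac{\Pi(x^4 - \alpha^4)}{\Pi(1 - \alpha^4x^4)}, \qquad \frac{\operatorname{sl}(nv)}{\operatorname{sl} v} = \frac{\Pi(x^4 - \beta^4)}{\Pi(1 - \beta^4x^4)}.$$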


When he combined these formulas with (34.133) and (34.134), he obtained

$$n^{\frac{p-1}{4}} \equiv \frac{\Pi(\alpha^4 - \beta^4)}{\Pi(1 - \alpha^4\beta^4)} \pmod{m}, \qquad m^{\frac{q-1}{4}} \equiv \frac{\Pi(\beta^4 - \alpha^4)}{\Pi(1 - \alpha^4\beta^4)} \pmod{n}.$$

Eisenstein observed that since there were $\frac{p-1}{4}\cdot\frac{q-1}{4}$ factors in the products, the fundamental
theorem on biquadratic residues, or quartic reciprocity, followed immediately.

Eisenstein studied the polynomial φ(x) in even greater detail later in his 1845
paper⁵¹ “Beiträge zur Theorie der elliptischen Functionen, I.” For primary m, he
proved that

$$\varphi(x) = x^{p-1} + A_1x^{p-5} + \cdots + m,$$

and showed that all the coefficients $A_1, A_2, \ldots, m$ were divisible by m. Then, in 1850,
he published a paper using a generalization of what we now call Eisenstein’s criterion
to prove the irreducibility of φ(x). Suppose $f(x) = a_0x^n + a_1x^{n-1} + \cdots + a_n$,
where $a_i \in \mathbb{Z}[i]$. Also suppose m is a prime in Z[i] such that m divides $a_1, \ldots, a_n$,
but does not divide $a_0$, and m² does not divide $a_n$. Then f(x) is irreducible over Z[i].
Eisenstein included a statement and proof of this theorem in an 1847 letter to Gauss.
But in 1846, Theodor Schönemann, a student of Jacobi and of the Swiss geometer
Jakob Steiner, published this theorem for the case where Z[i] was replaced by Z; this
particular case is now known as Eisenstein’s criterion or the Schönemann-Eisenstein
criterion. Eisenstein acknowledged this work in his 1850 paper.⁵²
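For the case of ordinary integers, the hypotheses of the criterion are simple to test mechanically. The sketch below assumes coefficients in Z listed from the leading coefficient down to the constant term; the function name is illustrative, and a result of False only means that the criterion gives no information for that prime.

```python
def eisenstein_applies(coeffs, p):
    """Test the hypotheses of the criterion over Z for the prime p.

    coeffs lists a_0, a_1, ..., a_n for a_0 x^n + a_1 x^(n-1) + ... + a_n.
    """
    leading, rest, constant = coeffs[0], coeffs[1:], coeffs[-1]
    return (leading % p != 0                       # p does not divide a_0
            and all(a % p == 0 for a in rest)      # p divides a_1, ..., a_n
            and constant % (p * p) != 0)           # p^2 does not divide a_n

print(eisenstein_applies([1, 0, 0, 10, 5], 5))  # x^4 + 10x + 5 with p = 5
print(eisenstein_applies([1, 0, -4], 2))        # x^2 - 4 = (x - 2)(x + 2)
```

The first polynomial satisfies the hypotheses and is therefore irreducible; the second fails them because 4 divides the constant term, as it must, since the polynomial factors.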


34.13 Liouville’s Theory of Elliptic Functions 


Liouville’s contributions to this topic are mainly contained in his lectures, published 
by Borchardt in 1880.⁵³ However, as we mentioned earlier, Liouville began to grapple
with elliptic functions as early as the 1840s. We briefly discuss his early thoughts, 


5! Bisenstein (1975) vol. 1, pp. 299-324. 
52 See Lemmermeyer (2000) p. 254. 
53 Liouville (1880). 
contained in his numerous notebooks.⁵⁴ When Hermite remarked to Liouville that
one could use Fourier series to prove Jacobi’s theorem on the ratio of two independent 
periods of a function, Liouville was apparently motivated to prove this and wrote it up 
in his notebook on August 1, 1844. A few pages later he included a more direct proof, 
supposing that f had real periods a and a’, independent over the rationals. Then, using 
the fact that a was a period, he noted that f had a Fourier expansion 


$$f(x) = \sum A_j \cos\left(\frac{2j\pi x}{a} + \epsilon_j\right).$$

Since a′ was also a period, he had

$$A_j \cos\left(\frac{2j\pi x}{a} + \epsilon_j\right) = A_j \cos\left(\frac{2j\pi x}{a} + \epsilon_j + \frac{2j\pi a'}{a}\right).$$

Thus, Liouville concluded that either $A_j = 0$ or $2j\pi a'/a = 2m\pi$, where m was an
integer. The last equation implied that a and a′ were commensurable, or dependent
over the rationals.

These ideas soon led to the statement and proof of the theorem now famous as 
Liouville’s theorem, that a bounded entire function is a constant. He first proved this 
for doubly-periodic functions, using Fourier series. He then extended it to functions 
bounded on the Riemann sphere. Assuming the result for periodic functions, he proved 
the extension by taking an analytic function f(z) and assuming | f(z)| < M for all z. 
Then the function f(snz), f composed with the Jacobi elliptic function, would be 
a doubly-periodic bounded function and hence a constant. Liouville noted that an 
application of this theorem was that every algebraic equation had to have a root. 
He argued that if p(x) was a polynomial and 1/p(x) did not become infinite for any
complex x, then the same would be true for 1/p(sn z), and this was a contradiction. It is
interesting that though Liouville never published this application, it is usually the first 
one to be given in textbooks. 

Liouville also proved that an elliptic function could not have only one simple pole. 
He noted that, on the other hand, if there were two simple poles, then the function 
would reduce to the usual elliptic function. In this connection, Liouville showed in his 
notebooks that if φ had two simple poles, α and β, then there would be a constant D
such that

$$u = (\varphi(\alpha + x) - D) + (\varphi(\alpha - x) - D)$$

was a solution to


54 See Liitzen (1990) chapter 13. 
This meant that u was the inverse of the elliptic integral:

$$x = \int \frac{du}{\sqrt{a + bu + cu^2 + du^3 + eu^4}}.$$


The following theorems of Liouville on elliptic functions commonly appear in 
modern treatments of the topic: 


• The number of poles equals the number of zeros, counting multiplicity.

• The sum of the zeros minus the sum of the poles (in a period parallelogram) is a
period of the function.

• The sum of the residues is equal to zero. Liouville proved this for functions with
only two poles.

• A doubly-periodic function with only one simple pole does not exist.


Liouville wrote that within his new approach, “integrals which have given rise to the 
elliptic functions and even moduli disappear in a way, leaving only the periods and the 
points for which the functions become zero or infinite.” This important new principle, 
that a function may be largely defined by its singularities, was greatly extended by 
Riemann in his remarkable works on functions of a complex variable. 

We present Liouville’s proofs based on Borchardt’s notes. Liouville considered
a doubly-periodic function φ(z) with periods 2ω and 2ω′ so that its values were
completely defined by its values in the region

$$z = z_0 + u\omega + u'\omega', \qquad -1 < u < 1, \quad -1 < u' < 1.$$

We would now refer to this region as the period parallelogram $P_{z_0}$. Liouville then
assumed that $z = \alpha, z = \alpha_1, z = \alpha_2, \ldots, z = \alpha_{n-1}$ were the n roots of the equation
φ(z) = ∞ in this region. Then there would exist constants $G, G_1, \ldots, G_{n-1}$ so that

$$\varphi(z) - \frac{G}{z - \alpha} - \frac{G_1}{z - \alpha_1} - \cdots - \frac{G_{n-1}}{z - \alpha_{n-1}} \tag{34.140}$$

was finite at $\alpha, \alpha_1, \ldots, \alpha_{n-1}$. In the case where there were multiple roots, so that the
(say i) values $\alpha_p, \alpha_q, \ldots, \alpha_s$ coincided, the sum of simple fractions

$$\frac{G_p}{z - \alpha_p} + \frac{G_q}{z - \alpha_q} + \cdots + \frac{G_s}{z - \alpha_s}$$

had to be replaced by

$$\frac{G_p}{z - \alpha_p} + \frac{G_q}{(z - \alpha_p)^2} + \cdots + \frac{G_s}{(z - \alpha_p)^i}. \tag{34.141}$$

Liouville designated the sum of the fractions as the fractional part of φ(z) and denoted
it by [φ(z)]. He noted that this fractional part played an important role in the calculus
of residues, and he showed that a doubly-periodic function without a fractional part
was a constant. Liouville did not refer to poles, but we would now say that a doubly-periodic
function must have poles. Note that in (34.140) all the poles are simple and
in (34.141), $\alpha_p$ is a pole of order i.

Liouville next proved that there could be no doubly-periodic function with a
fractional part [φ(z)] = G/(z − α). In other words, there did not exist a doubly-periodic
function with just one simple pole in the period parallelogram. To prove this, Liouville
set z − α = t so that the fractional part at α would be given by

$$[\varphi(z)]^{\alpha} = [\varphi(\alpha + t)]^{0} = \frac{G}{t}.$$

Similarly,

$$[\varphi(\alpha - t)]^{0} = -\frac{G}{t},$$

so that

$$[\varphi(\alpha + t) + \varphi(\alpha - t)]^{0} = 0,$$

and therefore φ(α + t) + φ(α − t) = 2c, where c was a constant. Liouville then set
f(t) = φ(α + t) − c, to get f(t) = −f(−t). Since 2ω and 2ω′ were periods of f, he
obtained

$$f(\omega) = -f(-\omega) = -f(-\omega + 2\omega) = -f(\omega) = 0, \tag{34.142}$$

$$f(\omega') = 0, \qquad f(\omega + \omega') = 0. \tag{34.143}$$

He then defined a new function F(t) = f(t) f(t + ω), noting that this function had
no singularities; the zeros cancelled with the poles, based on (34.142) and (34.143).
This implied that there were constants k, k′, and k″ such that

$$f(t)f(t + \omega) = k, \qquad f(t)f(t + \omega') = k', \qquad f(t)f(t + \omega + \omega') = k''. \tag{34.144}$$

Liouville changed t to t + ω in the third equation, obtaining

$$f(t + \omega)f(t + \omega') = k''.$$

Finally, multiplying the first two equations and dividing by the fourth he arrived at

$$f(t)^2 = \frac{kk'}{k''}.$$

This implied that φ(z) was a constant and the result was proved.
Liouville then gave a simple construction of a doubly-periodic function with
periods 2ω and 2ω′ and poles at α and β. He set

$$\varphi(z) = \sum_{i=-\infty}^{\infty} f(z + 2i\omega'),$$

where

$$f(z) = \frac{1}{\cos\frac{\pi(z - h)}{\omega} - \cos\frac{\pi h'}{\omega}}, \qquad \alpha = h + h', \quad \beta = h - h'.$$
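Liouville's construction can be explored numerically. The sketch below is only an illustration under a stated assumption: it takes the building block to be f(z) = 1/(cos(π(z − h)/ω) − cos(πh′/ω)), a reading of the formula that has period 2ω and simple poles at h ± h′, and it truncates the infinite sum. The parameter values and names are arbitrary choices, not Liouville's.

```python
import cmath
import math

def f(z, omega, h, h_prime):
    """Assumed building block, periodic in z with period 2*omega."""
    return 1.0 / (cmath.cos(math.pi * (z - h) / omega)
                  - cmath.cos(math.pi * h_prime / omega))

def phi(z, omega=1.0, omega_prime=0.8j, h=0.3 + 0.1j, h_prime=0.25, terms=20):
    """Truncated sum of f(z + 2*i*omega_prime) over i = -terms, ..., terms."""
    return sum(f(z + 2 * i * omega_prime, omega, h, h_prime)
               for i in range(-terms, terms + 1))

z = 0.2 + 0.3j
print(abs(phi(z + 2.0) - phi(z)))   # shift by 2*omega
print(abs(phi(z + 1.6j) - phi(z)))  # shift by 2*omega_prime
```

Both printed differences should be at the level of rounding error. The terms of the sum decay exponentially because the cosine grows exponentially in the imaginary direction, so a modest truncation suffices here.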


Liouville next analyzed the zeros of an elliptic function. He observed that a doubly-periodic
function φ(z), with poles at α and β, could not have only one simple zero
because its reciprocal would have one simple pole, an impossibility. He then showed
that φ(z) could not have three zeros. Supposing a and b to be two of the zeros, he
took another function ψ(z) with periods 2ω and 2ω′ and poles at a and b. He also
set $\psi_1(z) = \psi(z) - \psi(\alpha)$. Clearly, $\psi_1(z)\varphi(z)$ had only one pole at β, implying that
$\psi_1(z)\varphi(z)$ = constant. Now if φ had another zero at c, then $\psi_1$ would have a pole
at c. This contradiction proved that φ(z) had zeros only at a and b. Liouville also
proved that if two functions φ(z) and $\varphi_1(z)$ had the same periods with simple poles at
α and β, then there existed constants c, c′, such that $\varphi_1(z) = c\varphi(z) + c'$. To prove this
he set

$$[\varphi(z)] = \frac{G}{z - \alpha} + \frac{H}{z - \beta}, \qquad [\varphi_1(z)] = \frac{G_1}{z - \alpha} + \frac{H_1}{z - \beta}, \tag{34.145}$$

so that

$$[G\varphi_1(z) - G_1\varphi(z)] = \frac{GH_1 - G_1H}{z - \beta}.$$

Hence, the result:

$$G\varphi_1(z) - G_1\varphi(z) = \text{constant}. \tag{34.146}$$


Liouville proceeded to prove that for a doubly-periodic function φ, the sum of the
zeros was equal to the sum of the poles, modulo some period of the function. He
assumed that φ had poles at α and β, so that φ(α + β − z) also had poles at α and β.
Hence, by the previous result,

$$\varphi(z) = c\,\varphi(\alpha + \beta - z) + c'.$$

Replacing z by α + β − z, he got the relation

$$\varphi(\alpha + \beta - z) = c\,\varphi(z) + c',$$

and by subtraction

$$(1 + c)\bigl(\varphi(z) - \varphi(\alpha + \beta - z)\bigr) = 0.$$

Liouville noted that if c = −1, then

$$\varphi(z) + \varphi(\alpha + \beta - z) = c'.$$

To prove this impossible, he set z = (α + β)/2 + t and φ((α + β)/2 + t) − c′/2 = f(t), so
that f(t) = −f(−t). From this and seeing that 2ω and 2ω′ were periods of f(t), it
followed that

$$f(0) = f(\omega) = f(\omega') = f(\omega + \omega') = 0.$$

Since f could not have four roots, he obtained the required contradiction and therefore

$$\varphi(z) = \varphi(\alpha + \beta - z).$$

By taking the reciprocal of φ, Liouville saw that φ(z) = φ(a + b − z). He thus arrived
at the relation

$$\varphi(z) = \varphi(\alpha + \beta - z) = \varphi(\alpha + \beta - a - b + z)$$

and concluded that

$$\alpha + \beta = a + b + 2m\omega + 2m'\omega'.$$


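Liouville went on to show that any doubly-periodic function could be written in
terms of functions of the form φ. He presented the details for functions with simple
poles. Suppose ψ is a function with periods 2ω and 2ω′ and

$$[\psi(z)] = \frac{A}{z - \alpha} + \frac{A_1}{z - \alpha_1} + \frac{A_2}{z - \alpha_2} + \cdots.$$

Denote by $\varphi(z;\alpha,\alpha_1)$ the function with the same periods as ψ and with simple poles
at α and $\alpha_1$ and let

$$[\varphi(z;\alpha,\alpha_1)] = \frac{G_1}{z - \alpha} - \frac{G_1}{z - \alpha_1},$$

$$[\varphi(z;\alpha_1,\alpha_2)] = \frac{G_2}{z - \alpha_1} - \frac{G_2}{z - \alpha_2},$$

$$[\varphi(z;\alpha_2,\alpha_3)] = \frac{G_3}{z - \alpha_2} - \frac{G_3}{z - \alpha_3}, \quad\text{and so on.}$$

Then

$$[\psi(z)] = B_1[\varphi(z;\alpha,\alpha_1)] + B_2[\varphi(z;\alpha_1,\alpha_2)] + B_3[\varphi(z;\alpha_2,\alpha_3)] + \cdots$$

$$\text{with}\quad B_1 = \frac{A}{G_1}, \quad B_2 = \frac{A + A_1}{G_2}, \quad B_3 = \frac{A + A_1 + A_2}{G_3}, \quad\text{etc.}$$

Thus,

$$\psi(z) = B + B_1\varphi(z;\alpha,\alpha_1) + B_2\varphi(z;\alpha_1,\alpha_2) + B_3\varphi(z;\alpha_2,\alpha_3) + \cdots.$$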
This theorem was then employed to prove that any doubly-periodic function had
exactly as many zeros as poles. Here let φ(z; α, β; a, b) denote a function with periods
2ω, 2ω′; poles at α, β; zeros at a, b, with α + β = a + b. Suppose also that ψ(z) is
doubly-periodic with poles at $z = \alpha, \alpha_1, \ldots, \alpha_{n-1}$ and zeros at $z = a, a_1, a_2, \ldots, a_{i-1}$.
If i < n, Liouville arbitrarily chose n − i − 1 numbers $a_i, a_{i+1}, \ldots, a_{n-2}$ and
determined $b, b_1, \ldots, b_{n-2}$ by the system of equations

$$\begin{aligned}
b &= \alpha + \alpha_1 - a,\\
b_1 &= \alpha_2 + b - a_1,\\
b_2 &= \alpha_3 + b_1 - a_2,\\
&\;\;\vdots\\
b_{n-2} &= \alpha_{n-1} + b_{n-3} - a_{n-2}.
\end{aligned}$$

He next defined w(z) as

$$\varphi(z;\alpha,\alpha_1;a,b)\cdot\varphi(z;\alpha_2,b;a_1,b_1)\cdot\varphi(z;\alpha_3,b_1;a_2,b_2)\cdots\varphi(z;\alpha_{n-1},b_{n-3};a_{n-2},b_{n-2})$$

and noted that w(z) had poles at

$$\alpha, \alpha_1, \alpha_2, \ldots, \alpha_{n-1}$$

and zeros at

$$a, a_1, a_2, \ldots, a_{n-2}, b_{n-2}.$$

If i < n, then the function w(z)/ψ(z) had no poles but had zeros at $a_i, \ldots, a_{n-2}, b_{n-2}$,
an impossibility. Similarly, for i > n, he took the function ψ(z)/w(z) to get a similar
contradiction. Thus, i = n, ψ(z) = c w(z), and ψ(z) had as many zeros as
poles. Also, since ψ had zeros at $z = a, a_1, a_2, \ldots, a_{n-1}$ and w(z) had zeros at
$z = a, a_1, \ldots, a_{n-2}, b_{n-2}$, he could conclude that $b_{n-2} = a_{n-1}$. Liouville substituted
these values in his system of equations to arrive at

$$\alpha + \alpha_1 + \cdots + \alpha_{n-1} = a + a_1 + \cdots + a_{n-1}.$$

This implied that the sum of the zeros differed from the sum of the poles by
2mω + 2m′ω′, where m and m′ were integers. In the applications, Liouville derived the
differential equation and addition formula satisfied by a function φ with simple poles
at α and β. He also explained how to obtain the Abel and Jacobi elliptic functions
from his general results.


34.14 Hermite’s Theory of Elliptic Functions 


Charles Hermite had a life-long interest in the theory of elliptic functions. In fact, 
we can credit Hermite with the complex analytic proofs of the basic results that a 
doubly-periodic function must have poles and that the sum of the residues in a period
parallelogram must be zero. However, even before he gave these proofs in a lost paper
of 1849, of which a report by Cauchy exists,⁵⁵ Hermite published two papers on
elliptic functions. In the second of these, given in 1848 in the Cambridge and Dublin
Mathematical Journal,⁵⁶ Hermite took the ratio of functions with the same period and
then found sufficient conditions for this ratio to be doubly-periodic.

In this 1848 paper, he wrote that he had been motivated to develop his approach
to elliptic functions by Arthur Cayley’s 1845 paper⁵⁷ in which Cayley had started
with a double infinite product to establish the properties of elliptic functions. Hermite,
by contrast, began with the ratio of two periodic functions and he explained later⁵⁸
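why he did this: He observed that Liouville’s theorem amounted to the result that if a
periodic entire function

$$f(x) = \sum_{m=-\infty}^{\infty} A_m e^{\frac{2\pi i m x}{\omega_1}}$$

with period $\omega_1$ had another fundamental period $\omega_2$, then f(x) was a constant. Thus,
he found it natural to research the fraction

$$f(x) = \frac{\sum_{m=-\infty}^{\infty} A_m e^{\frac{2\pi i m x}{\omega_1}}}{\sum_{m=-\infty}^{\infty} B_m e^{\frac{2\pi i m x}{\omega_1}}}. \tag{34.147}$$

Clearly, $f(x + \omega_1) = f(x)$. In order to establish sufficient conditions for double
periodicity, Hermite supposed that

$$f(x + \omega_2) = f(x), \tag{34.148}$$

where $\operatorname{Im}(\omega_2/\omega_1) > 0$. Equation (34.148) led him to propose to determine the conditions
on $A_m$ and $B_m$ such that

$$\sum_{m=-\infty}^{\infty} A_m e^{\frac{2\pi i m x}{\omega_1}} \sum_{n=-\infty}^{\infty} B_n e^{\frac{2\pi i n (x+\omega_2)}{\omega_1}} = \sum_{n=-\infty}^{\infty} B_n e^{\frac{2\pi i n x}{\omega_1}} \sum_{m=-\infty}^{\infty} A_m e^{\frac{2\pi i m (x+\omega_2)}{\omega_1}}. \tag{34.149}$$

With μ taken to be an arbitrarily chosen integer, equating coefficients of $e^{2\pi i \mu x/\omega_1}$ on
each side of (34.149), and setting $q_1 = e^{i\pi\omega_2/\omega_1}$, he arrived at

$$\sum_{m=-\infty}^{\infty} A_{\mu-m} B_m q_1^{2m} = \sum_{n=-\infty}^{\infty} A_{\mu-n} B_n q_1^{2(\mu-n)}. \tag{34.150}$$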


55 Hermite (1905-1917) vol. 1, pp. 75-83. 

56 ibid. pp. 71-73. See also the treatment in Tannery and Molk (1972) vol. 2, pp. 152-158. 
57 Cayley (1889-1898) vol. 1, pp. 136-155. 

58 Hermite (1905-1917) vol. 2, pp. 143-148. 
Hermite observed that one way of satisfying equation (34.150) was to make it an
identity, so that each term $A_{\mu-m}B_m q_1^{2m}$ was equal to $A_{\mu-n}B_n q_1^{2(\mu-n)}$. Moreover, for
all values of μ he had

$$A_{\mu-m}B_m q_1^{2m} = A_{\mu-n}B_n q_1^{2(\mu-n)}. \tag{34.151}$$

Taking n = m + k, where k was an integer, he wrote equation (34.151) as

$$\frac{B_m}{B_{m+k}} = \frac{A_{\mu-(m+k)}}{A_{\mu-m}}\,q_1^{2(\mu-2m-k)}. \tag{34.152}$$

He next noted that since μ was arbitrary, he could set μ − (m + k) = m′, where m′
was independent of m, so that (34.152) became

$$\frac{A_{m'}\,q_1^{2m'}}{A_{m'+k}} = \frac{B_m\,q_1^{2m}}{B_{m+k}} = \text{const.}, \tag{34.153}$$

and $A_m$ and $B_m$ were solutions of the same difference equation

$$z_{m+k} = q_1^{2m+\alpha} z_m,$$

with α taking account of the constant; the general solution of this difference equation
was, as Hermite noted, given by

$$A_m = q_1^{\frac{m^2 + (\alpha - k)m}{k}} a_m \tag{34.154}$$

and

$$B_m = q_1^{\frac{m^2 + (\alpha - k)m}{k}} b_m, \tag{34.155}$$

which could be substituted in (34.147). However, by taking x to be x + δ for a suitable
constant δ, Hermite could write (34.154) and (34.155) as

$$A_m = q_1^{\frac{m^2}{k}} a_m \quad\text{and}\quad B_m = q_1^{\frac{m^2}{k}} b_m$$
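with

$$a_{m+k} = a_m \quad\text{and}\quad b_{m+k} = b_m.$$

Denoting the numerator and denominator of (34.147) by Φ(x) and Π(x) respectively,
he had

$$\Phi(x) = \sum_{m=-\infty}^{\infty} a_m q_1^{\frac{m^2}{k}} e^{\frac{2\pi i m x}{\omega_1}} \tag{34.156}$$

and

$$\Pi(x) = \sum_{m=-\infty}^{\infty} b_m q_1^{\frac{m^2}{k}} e^{\frac{2\pi i m x}{\omega_1}}, \tag{34.157}$$

functions having period $\omega_1$, but with respect to $\omega_2$, producing a factor $e^{-\frac{k i\pi}{\omega_1}(2x+\omega_2)}$.
For example,

$$\Phi(x + \omega_2) = \Phi(x)\,e^{-\frac{k i\pi}{\omega_1}(2x+\omega_2)} \tag{34.158}$$

and similarly for $\Pi(x + \omega_2)$. To prove (34.158), Hermite noted that changing m to
m − k in the sum (34.156) left the sum unchanged. Thus

$$\begin{aligned}
\Phi(x + \omega_2) &= \sum_{m=-\infty}^{\infty} a_m q_1^{\frac{m^2}{k} + 2m} e^{\frac{2\pi i m x}{\omega_1}} \\
&= \sum_{m=-\infty}^{\infty} a_{m-k}\, q_1^{\frac{(m-k)^2}{k} + 2(m-k)} e^{\frac{2\pi i (m-k) x}{\omega_1}} \\
&= e^{-\frac{2\pi i k x}{\omega_1}} q_1^{-k} \sum_{m=-\infty}^{\infty} a_m q_1^{\frac{m^2}{k}} e^{\frac{2\pi i m x}{\omega_1}} \\
&= \Phi(x)\,e^{-\frac{k i\pi}{\omega_1}(2x+\omega_2)}.
\end{aligned}$$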

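Note that since $a_{m+k} = a_m$, the series for Φ(x) and Π(x) can each be broken up
into k separate series. Thus

$$\Phi(x) = \sum_{\mu=0}^{k-1} a_\mu \sum_{m=-\infty}^{\infty} q_1^{\frac{(km+\mu)^2}{k}} e^{\frac{2\pi i (km+\mu) x}{\omega_1}}. \tag{34.159}$$

Hermite considered the particular case k = 2 with

$$\omega_1 = 4K, \qquad \omega_2 = 2iK', \qquad q_1 = e^{-\frac{\pi K'}{2K}}, \qquad q = e^{-\frac{\pi K'}{K}}.$$

In this case, (34.159) simplified to

$$\begin{aligned}
\Phi(x) &= a_0 \sum_{m=-\infty}^{\infty} q^{m^2} e^{\frac{i m \pi x}{K}} + a_1 \sum_{m=-\infty}^{\infty} q^{\frac{(2m+1)^2}{4}} e^{\frac{(2m+1) i\pi x}{2K}} \\
&= a_0\left(1 + 2q\cos 2\left(\frac{\pi x}{2K}\right) + 2q^4\cos 4\left(\frac{\pi x}{2K}\right) + 2q^9\cos 6\left(\frac{\pi x}{2K}\right) + \cdots\right) \\
&\quad + a_1\left(2q^{1/4}\cos\left(\frac{\pi x}{2K}\right) + 2q^{9/4}\cos 3\left(\frac{\pi x}{2K}\right) + 2q^{25/4}\cos 5\left(\frac{\pi x}{2K}\right) + \cdots\right).
\end{aligned}$$

He denoted the coefficient of $a_0$ by $\Theta_1(x)$ and the coefficient of $a_1$ by $H_1(x)$ so
that he had

$$\Theta_1\left(\frac{2Kx}{\pi}\right) = 1 + 2q\cos 2x + 2q^4\cos 4x + 2q^9\cos 6x + \cdots, \tag{34.160}$$

$$H_1\left(\frac{2Kx}{\pi}\right) = 2q^{1/4}\cos x + 2q^{9/4}\cos 3x + 2q^{25/4}\cos 5x + \cdots. \tag{34.161}$$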

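Hermite’s functions $\Theta_1$ and $H_1$ are very closely related to Jacobi’s two basic
theta functions as defined in his paper of 1828, “Suite des notices sur les fonctions
elliptiques:”⁵⁹

$$\Theta(x) = 1 - 2q\cos 2x + 2q^4\cos 4x - 2q^9\cos 6x + \cdots, \tag{34.162}$$

$$H(x) = 2q^{1/4}\sin x - 2q^{9/4}\sin 3x + 2q^{25/4}\sin 5x - \cdots. \tag{34.163}$$

Jacobi noted some properties of his functions and also observed that his elliptic
functions sn, cn, and dn could be expressed in terms of Θ and H:

$$\operatorname{sn}\frac{2Kx}{\pi} = \frac{1}{\sqrt{k}}\,\frac{H(x)}{\Theta(x)}, \qquad \operatorname{cn}\frac{2Kx}{\pi} = \sqrt{\frac{k'}{k}}\,\frac{H\left(x + \frac{\pi}{2}\right)}{\Theta(x)}, \qquad \operatorname{dn}\frac{2Kx}{\pi} = \sqrt{k'}\,\frac{\Theta\left(x + \frac{\pi}{2}\right)}{\Theta(x)}.$$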

In this way, Hermite’s theory of doubly-periodic functions succeeded in repro- 
ducing Jacobi’s theory of elliptic functions as the ratio of theta functions. See also 
Exercise 7 of this chapter. 

Hermite’s 1849 paper, communicated to the Académie des Sciences (Paris), showed 
how Cauchy’s calculus of residues could be applied to derive Liouville’s basic results 
on doubly-periodic functions. All that remains of this paper is Cauchy’s account,⁶⁰
according to which Hermite proved that a doubly-periodic meromorphic function f 


59 Jacobi (1969) vol. I, p. 256. 
60 Hermite (1905-1917) vol. I, pp. 75-83. 
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must have more than one simple pole in a period parallelogram. His method was to 
integrate the function along the boundary of the parallelogram. A period parallelogram 
is defined as the set of points inside the parallelogram ABC D, where A, B, C, D can 
be represented by the complex numbers ¢, ¢ + 1, € + @1 + 2, € + w2 respectively; 
@ and w2 are the fundamental periods of f. Thus, if we denote the boundary of the 
parallelogram ABCD by L, then Hermite showed that 


i f(zdz = 0. (34.164) 
L 


Note that the parallelogram can be chosen so that $L$ does not pass through any
singularities of $f$. Hermite’s argument is identical to the one found in modern
textbooks. He observed that

\[
\int_L f(z)\,dz = \int_A^B + \int_B^C + \int_C^D + \int_D^A f(z)\,dz. \tag{34.165}
\]


Since $\omega_2$ was a period of $f(z)$, he had

\[
\int_C^D f(z)\,dz = \int_B^A f(z+\omega_2)\,dz = -\int_A^B f(z)\,dz; \tag{34.166}
\]


because $\omega_1$ was also a period, he could write

\[
\int_D^A f(z)\,dz = \int_C^B f(z+\omega_1)\,dz = -\int_B^C f(z)\,dz. \tag{34.167}
\]


On substituting (34.166) and (34.167) in (34.165), Hermite had a proof of (34.164).
In their 1859 treatise on elliptic functions, Briot and Bouquet applied Hermite’s idea
of complex integration to prove basic theorems in elliptic function theory.⁶¹
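
The cancellation in (34.165)-(34.167) can be checked numerically. The following is only a sketch under stated assumptions, not a reconstruction of Hermite’s computation: it takes $f = H/\Theta$ built from Jacobi’s series (34.162) and (34.163), and it assumes the standard fact (not derived in the text) that this ratio has periods $2\pi$ and $-i\log q$ in the variable $x$. The nome `q = 0.1`, the corner `zeta`, and all function names are illustrative choices.

```python
import cmath
import math

def jacobi_theta(x, q, terms=25):
    """Jacobi's series (34.162): 1 - 2q cos 2x + 2q^4 cos 4x - ..."""
    return 1 + 2 * sum((-1) ** n * q ** (n * n) * cmath.cos(2 * n * x)
                       for n in range(1, terms))

def jacobi_eta(x, q, terms=25):
    """Jacobi's series (34.163): 2q^(1/4) sin x - 2q^(9/4) sin 3x + ..."""
    return 2 * sum((-1) ** n * q ** ((2 * n + 1) ** 2 / 4)
                   * cmath.sin((2 * n + 1) * x) for n in range(terms))

def segment_integral(g, a, b, steps=2000):
    """Midpoint rule for the integral of g along the segment from a to b."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def residue(g, pole, radius=0.3, steps=400):
    """Residue at `pole`: (1/(2*pi*i)) times the integral over a small circle."""
    total = 0
    for k in range(steps):
        w = radius * cmath.exp(2j * math.pi * (k + 0.5) / steps)
        total += g(pole + w) * w
    return total / steps

q = 0.1                          # illustrative nome, 0 < q < 1
w1 = 2 * math.pi                 # assumed first period of H/Theta
w2 = -1j * math.log(q)           # assumed second period, i*|log q|
f = lambda x: jacobi_eta(x, q) / jacobi_theta(x, q)

zeta = -0.5 - 0.4j               # corner chosen so that no pole lies on L
A, B, C, D = zeta, zeta + w1, zeta + w1 + w2, zeta + w2
boundary_integral = sum(segment_integral(f, a, b)
                        for a, b in [(A, B), (B, C), (C, D), (D, A)])

# The poles inside the parallelogram are the two zeros of Theta located there.
res1 = residue(f, w2 / 2)
res2 = residue(f, math.pi + w2 / 2)
print(abs(boundary_integral), res1, res2)
```

In this example the boundary integral vanishes up to rounding, and the two residues are nonzero and cancel, which is consistent with the statement that such a function cannot have just one simple pole in a period parallelogram.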


34.15 Exercises 


(1) If $\phi$ is the cosine transform of $f$, then
\[
\sqrt{\alpha}\bigl(f(\alpha) - f(3\alpha) - f(5\alpha) + f(7\alpha) + f(9\alpha) - \cdots\bigr)
 = \sqrt{\beta}\bigl(\phi(\beta) - \phi(3\beta) - \phi(5\beta) + \cdots\bigr),
\]
where $\alpha\beta = \pi/4$; and for $\alpha\beta = \pi/6$,
\[
\sqrt{\alpha}\bigl(f(\alpha) - f(5\alpha) - f(7\alpha) + f(11\alpha) + f(13\alpha) - \cdots\bigr)
 = \sqrt{\beta}\bigl(\phi(\beta) - \phi(5\beta) - \phi(7\beta) + \cdots\bigr),
\]
where the integers 1, 5, 7, 11, 13, ... are prime to 6. See Ramanujan (2000)
p. 63.

61 Briot and Bouquet (1859), especially pp. 79-81.


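(2) Let $\omega/2$ denote the complete lemniscatic integral $\int_0^1 \frac{dx}{\sqrt{1-x^4}}$.
Show that then
\[
\frac{\omega}{4\pi} = \frac{e^{\pi/2}}{e^{\pi} - 1} - \frac{e^{3\pi/2}}{e^{3\pi} - 1}
  + \frac{e^{5\pi/2}}{e^{5\pi} - 1} - \cdots,
\]
\[
\frac{\omega^2}{4\pi^2} = \frac{e^{\pi/2}}{e^{\pi} + 1} - \frac{3e^{3\pi/2}}{e^{3\pi} + 1}
  + \frac{5e^{5\pi/2}}{e^{5\pi} + 1} - \cdots.
\]
See Abel (1965) vol. 1, p. 351.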
(3) For $f(\alpha)$ and $F(\alpha)$ defined by (34.11) and (34.12), and $\omega$ by (34.7), show that


3x 3x 5a 5x 
2 @ e2—e 2 e2 —e 2 


aw) _ 4m ( cost) —_cos(34#) __cos(S4) 
(B)=F(, .) 


ELE a 
2—e 2 


3x 3x 5a Sx 


ez —e 2 ez —e 2 


pe (Se eG. 
w et —e7-2 


(4) Consider the equation
\[
\frac{dy}{\sqrt{(1-y^2)(1+e^2y^2)}} = \frac{\sqrt{-n}\,dx}{\sqrt{(1-x^2)(1+e^2x^2)}}.
\]
Show that if $n = 3$, then $e$ satisfies the equation $e^2 - 2\sqrt{3}\,e = 1$; and if $n = 5$,
then $e$ satisfies $e^3 - 1 - (5 + 2\sqrt{5})\,e(e-1) = 0$. Kronecker called these values
of $e$ singular moduli. See Abel (1965) vol. 1, pp. 379-384.


(5) Show that if the equation
\[
\frac{dy}{\sqrt{(1-c^2y^2)(1+e^2y^2)}} = a\,\frac{dx}{\sqrt{(1-c^2x^2)(1+e^2x^2)}}
\]
admits of an algebraic solution in $x$ and $y$, then $a$ is necessarily of the form
$\mu' + \sqrt{-\mu}$ where $\mu$ and $\mu'$ are rational and $\mu$ is positive. For such values
of $a$, the moduli $e$ and $c$ can be expressed in radicals. See Abel (1965) vol. 1,
pp. 425-428. Kronecker made a deep study, related to algebraic number theory,
of complex multiplication. For a historical discussion of this topic, see Vladut
(1991). Also see Takase (1994).


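(6) Show that the functions $\theta$ and $\theta_1$ below satisfy
$\dfrac{\partial^2 \theta}{\partial x^2} = 4\,\dfrac{\partial \theta}{\partial \omega}$:
\[
\theta(x) = 1 - 2e^{-\omega} \cos 2x + 2e^{-4\omega} \cos 4x - 2e^{-9\omega} \cos 6x + \cdots,
\]
\[
\theta_1(x) = 2e^{-\omega/4} \sin x - 2e^{-9\omega/4} \sin 3x + 2e^{-25\omega/4} \sin 5x - \cdots.
\]
See Jacobi (1969) vol. 1, p. 259.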


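(7) Jacobi defined the theta functions (with $q = e^{-\pi K'/K}$):
\[
\Theta\Bigl(\frac{2Kx}{\pi}\Bigr) = 1 - 2q \cos 2x + 2q^{4} \cos 4x - 2q^{9} \cos 6x + \cdots,
\]
\[
H\Bigl(\frac{2Kx}{\pi}\Bigr) = 2q^{1/4} \sin x - 2q^{9/4} \sin 3x + 2q^{25/4} \sin 5x - \cdots.
\]
Show that for $u = \frac{2Kx}{\pi}$, we have
\[
\Theta(u+2K) = \Theta(-u) = \Theta(u), \qquad H(u+2K) = H(-u) = -H(u);
\]
\[
\Theta(u+2iK') = -e^{\pi(K' - iu)/K}\,\Theta(u);
\]
\[
\operatorname{sn} u = \frac{H(u)}{\sqrt{k}\,\Theta(u)}, \qquad
\operatorname{cn} u = \frac{\sqrt{k'}\,H(u+K)}{\sqrt{k}\,\Theta(u)}.
\]
See Jacobi (1969) vol. 1, pp. 224-231.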


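(8) Let $w = 2m\omega + 2n\omega'$, where $m$ and $n$ are integers, and let $\tau = \omega'/\omega$ with
$\operatorname{Im} \tau \neq 0$. Note that $\sigma'$ denotes the derivative of $\sigma$. Define the Weierstrass
sigma function by
\[
\sigma(u) = u \prod{}' \Bigl(1 - \frac{u}{w}\Bigr) e^{\frac{u}{w} + \frac{u^2}{2w^2}},
\]
where the product is taken over all $m$ and $n$ except $m = n = 0$. Show that
\[
\sigma(u) = e^{2\eta\omega v^2}\,\frac{2\omega}{\pi}\,\sin v\pi
   \prod_{n=1}^{\infty} \frac{1 - 2h^{2n} \cos 2v\pi + h^{4n}}{(1 - h^{2n})^2},
\]
where $v = \frac{u}{2\omega}$, $h = e^{\tau\pi i}$, and
$\eta = \frac{\pi^2}{2\omega}\Bigl(\frac{1}{6} + \sum_{n=1}^{\infty} \csc^2 n\tau\pi\Bigr)$. Show also that
\[
\sigma(u + 2\omega) = -e^{2\eta(u+\omega)}\sigma(u), \qquad \eta = \frac{\sigma'(\omega)}{\sigma(\omega)},
\]
\[
\sigma(u + 2\omega') = -e^{2\eta'(u+\omega')}\sigma(u), \qquad \eta' = \frac{\sigma'(\omega')}{\sigma(\omega')}.
\]
Prove Legendre’s relation $\eta\omega' - \omega\eta' = \frac{\pi i}{2}$, when $\operatorname{Im} \tau > 0$. See Schwarz
(1893) pp. 5-9; note that these are Schwarz’s notes of Weierstrass’s lectures.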


(9) Set $\wp(u) = -\frac{d^2}{du^2} \log \sigma(u)$. Show that $\wp'(\omega) = \wp'(\omega') = \wp'(\omega + \omega') = 0$. Set
$\wp(\omega) = e_1$, $\wp(\omega + \omega') = e_2$, $\wp(\omega') = e_3$ and show that
\[
(\wp'(u))^2 = 4(\wp(u) - e_1)(\wp(u) - e_2)(\wp(u) - e_3).
\]
Also prove that
\[
\wp(u) - \wp(v) = -\frac{\sigma(u+v)\,\sigma(u-v)}{\sigma^2(u)\,\sigma^2(v)},
\]
\[
\wp(u \pm v) = -\wp(u) - \wp(v) + \frac{1}{4}\Bigl(\frac{\wp'(u) \mp \wp'(v)}{\wp(u) - \wp(v)}\Bigr)^2.
\]
The last result is the addition formula for Weierstrass’s $\wp$-function. See
Schwarz (1893) pp. 10-14.
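(10) Let $k^2 = \frac{e_2 - e_3}{e_1 - e_3}$. Set
\[
\sigma_1(u) = e^{-\eta u}\,\frac{\sigma(\omega + u)}{\sigma(\omega)}, \qquad
\sigma_3(u) = e^{-\eta' u}\,\frac{\sigma(\omega' + u)}{\sigma(\omega')}.
\]
Prove that
\[
\frac{\sigma(u)}{\sigma_3(u)} = \frac{\operatorname{sn}(\sqrt{e_1 - e_3}\cdot u, k)}{\sqrt{e_1 - e_3}}
\]
and
\[
\frac{\sigma_1(u)}{\sigma_3(u)} = \operatorname{cn}(\sqrt{e_1 - e_3}\cdot u, k).
\]
See Schwarz (1893) pp. 30-35.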


34.16 Notes on the Literature 


Abel (2007) contains an English translation of Abel’s papers on analysis. A 150-page 
summary of Abel’s mathematical work is given in C. Houzel’s article in Laudal and 
Piene (2002). Consult Prasad (1933) for an interesting account of the work of Abel 
and Jacobi in elliptic functions. Ramanujan was also a master of the theory of elliptic 
functions. For his work, see Venkatachaliengar and Cooper (2011). For Ramanujan’s 
prolific work on modular equations, see Berndt’s helpful summary in Andrews, Askey, 
Berndt et al. (1988). 

As pointed out in the text, the Eisenstein criterion given in algebra textbooks is due
to Schönemann. Refer to Lemmermeyer (2000) for its history. A historical account of
this criterion is also available in Cox (2004). The entertaining book by Dörrie (1965),
page 19, attributes the criterion to Schönemann alone.
Irrational and Transcendental Numbers


35.1 Preliminary Remarks 


The ancient Greek mathematicians were aware of the existence of irrational numbers;
Eudoxus gave his theory of proportions to deal with that awkward situation. The
Greeks also considered the problem of constructing a square with area equal to
that of a circle. Later generations of mathematicians probably began to suspect that
this was not possible; they were possibly almost certain that $\pi$ was not rational.
The sixteenth-century Indian mathematician and astronomer, Nilakantha, wrote in
his Aryabhatiyabhasya, “If the diameter, measured using some unit of measure, were
commensurable with that unit, then the circumference would not allow itself to be
measured by means of the same unit; so likewise in the case where the circumference
is measurable by some unit, then the diameter cannot be measured using the same
unit.”¹ He gave no indication of a proof in any of his works. It appears that the first
proof of the irrationality of $\pi$ was presented to the Berlin Academy by the Swiss
mathematician J. H. Lambert (1728-1777) in a 1768 paper.² He demonstrated that
if $x \neq 0$ was a rational number, then $\tan x$ was irrational. He deduced this from the
continued fraction expansion

\[
\tan x = \cfrac{x}{1 - \cfrac{x^2}{3 - \cfrac{x^2}{5 - \cdots}}}.
\]


Then, since $\tan \frac{\pi}{4} = 1$, it followed that $\pi$ was irrational. Lambert’s work was based
on some results of Euler, who was a colleague of Lambert for about two years at the
Berlin Academy. Later, in his 1794 book on geometry, Legendre gave a completely
rigorous and concise presentation of Lambert’s proof.³ In particular, he showed that
the continued fraction

\[
\cfrac{m_1}{n_1 + \cfrac{m_2}{n_2 + \cfrac{m_3}{n_3 + \cdots}}},
\]

where $m_i, n_i$ were nonzero integers, converged to an irrational number when
$\left|\frac{m_i}{n_i}\right| < 1$ for all $i$ beyond some $i_0$. Legendre went a little further than
Lambert by observing that the continued fraction for $\tan x$ also implied that $\pi^2$ was
irrational.

1 Yushkevich (1964) p. 169.
2 Lambert (1768).
3 Legendre (1894) Note VI, pp. 296-304.
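
As a small numerical illustration of the continued fraction for $\tan x$, here is a sketch that evaluates the standard form $x/(1 - x^2/(3 - x^2/(5 - \cdots)))$ from the bottom up. The function name and the truncation depth are illustrative assumptions; the code only checks agreement with the library tangent and proves nothing about irrationality.

```python
import math
from fractions import Fraction

def lambert_tan(x, depth=12):
    """Evaluate x/(1 - x^2/(3 - x^2/(5 - ...))), keeping `depth` odd denominators."""
    value = 2 * depth - 1                 # innermost denominator kept
    for k in range(depth - 1, 0, -1):     # work outward: 2k-1 - x^2/(previous)
        value = (2 * k - 1) - x * x / value
    return x / value

print(lambert_tan(1.0), math.tan(1.0))
print(lambert_tan(math.pi / 4))           # close to 1, since tan(pi/4) = 1

# With a rational argument the truncations are exact rational convergents.
print(lambert_tan(Fraction(1, 2), depth=4))
```

For rational $x$ each truncation is a rational number, and it is the speed with which these convergents approach $\tan x$ that drives the irrationality argument.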

A century before Lambert, James Gregory tried to prove that $\pi$ was transcendental.
Since he was starting from scratch, it is not surprising that Gregory failed. C. Goldbach
and D. Bernoulli carried on a correspondence in the 1720s, in which they mentioned
that the series they had discovered could not represent rational numbers or even roots
of rational numbers. Thus, in a letter to Goldbach of April 28, 1729,⁴ Bernoulli
commented, concerning the series

\[
\log \frac{m+n}{n} = \frac{m}{n} - \frac{m^2}{2n^2} + \frac{m^3}{3n^3} - \frac{m^4}{4n^4} + \cdots,
\quad (m, n \text{ positive integers, } m < n),
\]

“Moreover, when summed, these are expressible neither in rational numbers, nor even
in radicals or irrational numbers.” Unfortunately, he had to admit to Goldbach that he
had no proof; a proof of the transcendence of this number follows from a theorem
proved by Ferdinand Lindemann in 1882.⁵ In reply to Bernoulli, Goldbach remarked⁶
that it was not known whether, in general, with $f$ rational, the number

\[
\sum_{n=1}^{\infty} \frac{1}{n^2 + fn}
\]


could be expressed as a root of a rational number. Note, for example, that when $f = 2$,
the sum is $\frac{3}{4}$, while with $f = \frac{1}{2}$ the sum is $4 \log\bigl(\frac{e}{2}\bigr)$. Goldbach was probably aware
of the first result; though he may not have noticed it, he could have derived the second
result from Brouncker’s series for $\log 2$. Observe that the second result can also be
derived from the Euler-Maclaurin summation formula or from Mengoli’s inequalities,
given in Exercise 4 of Chapter 20. In a later letter of October 20, Goldbach wrote,⁷
“Here follows a series of fractions, such as you requested whose sum is neither rational
nor the root of any rational number:

\[
\frac{1}{10} + \frac{1}{100} + \frac{1}{10000} + \frac{1}{100000000} + \text{etc.}
\]

The general term is $\dfrac{1}{10^{2^{n-1}}}$.”

4 Fuss (1968) vol. 2, pp. 298-304, especially p. 301.
5 Lindemann (1882).
6 Fuss (1968) vol. 2, pp. 312-315, especially p. 313.
7 ibid. pp. 326-327. Translation from Lützen (1990) p. 514.


Neither Goldbach nor Bernoulli could suggest any method for attacking these
problems. Kurt Mahler (1903-1988), FRS, was a largely self-taught German
mathematician who had to leave Germany and had a long career in England, Australia, and
the United States. He learned Chinese and encouraged other mathematicians to read
Chinese mathematics. In 1926, he proved a theorem more general than required but as
a consequence of which Goldbach’s number was necessarily transcendental.⁸ In 1938,
Rodion Kuzmin (1891-1949) also gave a proof of the transcendence of Goldbach’s
number.⁹

In his 1748 book, Introductio in Analysin Infinitorum, Euler made some insightful
remarks on the values of the logarithm function,¹⁰ “Since the logarithms of numbers
which are not powers of the base are neither rational nor irrational, it is with justice
that they are called transcendental quantities.” He did not clearly define his meaning
of irrational, but from his examples we gather that he meant numbers expressible by
radicals. A clear definition of a transcendental number was given by Legendre in a
note in his 1794 book: “It is probable that the number $\pi$ is not even comprised among
algebraic irrationals, that is, it cannot be the root of an algebraic equation of a finite
number of terms whose coefficients are rational, but it seems very difficult to prove
this proposition rigorously.”¹¹

The first mathematician to rigorously prove the existence of transcendental numbers
was Liouville. In 1840, he published two notes showing that $e$ and $e^2$ could not
be solutions of a quadratic equation. The 1843 publication by P. H. Fuss of the
Euler, Goldbach, and Bernoulli correspondence further aroused Liouville’s interest in
transcendental numbers. He read a note on continued fractions to the French Academy
in 1844. Given that a continued fraction was the root of an algebraic equation with
integral coefficients (in modern terminology, an algebraic number), he stated the
condition that the terms of such a continued fraction had to satisfy. In a subsequent
paper presented to the Academy and published in the Comptes Rendus,¹² he presented
his famous criterion for a number to be algebraic of degree $n$: If $x$ was such a number,
then there existed an $A > 0$ such that for all rational $\frac{p}{q} \neq x$,

\[
\left| x - \frac{p}{q} \right| > \frac{A}{q^n}.
\]

8 See Mahler (1982) p. 182. Also see the Kurt Mahler online archive for more information on his outstanding
career.
9 Kuzmin (1938).
10 Euler (1988) p. 80.
11 Legendre (1894) Note VI, pp. 303-304. Translation from Lützen (1990) p. 516.
12 Liouville (1844).
He noted an almost immediate consequence of this:

\[
\frac{1}{l} + \frac{1}{l^{2!}} + \frac{1}{l^{3!}} + \cdots + \frac{1}{l^{n!}} + \cdots
\]

was a transcendental number, for $l > 1$ any integer.
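
To see why such a series cannot satisfy Liouville’s inequality for any fixed degree, one can compare a partial sum $p/q$, with $q = l^{n!}$, against the tail of the series. The sketch below is illustrative only: the helper names are hypothetical, and a longer partial sum stands in for the full series, which can only make the measured gap smaller than the true one by a negligible amount.

```python
from fractions import Fraction
from math import factorial

def liouville_partial(l, n):
    """Exact partial sum 1/l^(1!) + 1/l^(2!) + ... + 1/l^(n!)."""
    return sum(Fraction(1, l ** factorial(k)) for k in range(1, n + 1))

def approximation_gap(l, n, reference_terms=6):
    """Return (q, gap): denominator of the n-th partial sum and its distance
    to a longer partial sum used here as a stand-in for the full series."""
    s = liouville_partial(l, n)
    reference = liouville_partial(l, reference_terms)
    return s.denominator, reference - s

for n in range(1, 5):
    q, gap = approximation_gap(2, n)
    # The gap is positive yet smaller than 2/q^(n+1): the exponent grows with n.
    print(n, q, 0 < gap < Fraction(2, q ** (n + 1)))
```

Since the exponent $n + 1$ in the bound $2/q^{n+1}$ is unbounded, no single pair $(A, n)$ in Liouville’s criterion can accommodate all of these approximations.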

It is clear that Liouville was attempting to prove the transcendence of $e$ and it must
have pleased him that his younger friend Hermite did so in 1873. Hermite used the
basic identity

\[
\int e^{-z} F(z)\,dz = -e^{-z} \gamma(z),
\]

where $F(z)$ and $\gamma(z)$ were polynomials. Note that this can be proved by integration by
parts and depends on the fact that $\frac{d}{dz} e^{-z} = -e^{-z}$. By means of this formula, Hermite
defined certain polynomials with integer coefficients and employed them to obtain
simultaneous rational approximations of $e^x$, for certain integer values of $x$. These in
turn were sufficient to show that, except for the trivial case, there could be no equation
of the form

\[
e^{z_0} N_0 + e^{z_1} N_1 + \cdots + e^{z_n} N_n = 0, \tag{35.1}
\]

when $z_0, z_1, \ldots, z_n$ and $N_0, N_1, \ldots, N_n$ were all integers. We note that in the last
portion of his paper, Hermite used his method to obtain the rational approximations

\[
e \approx \frac{58019}{21344}, \qquad e^2 \approx \frac{157712}{21344}.
\]
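
The identity and the two approximations can both be checked mechanically. The sketch below assumes that $\gamma = F + F' + F'' + \cdots$, which is what repeated integration by parts produces for a polynomial $F$; the text does not spell this formula out, so treat it as an assumption. Polynomials are stored as coefficient lists in increasing degree, and all helper names are illustrative.

```python
import math
from fractions import Fraction

def derivative(p):
    """Derivative of a polynomial given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:]

def combine(p, r, sign=1):
    """Return p + sign*r, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(r))
    return [(p[i] if i < len(p) else 0) + sign * (r[i] if i < len(r) else 0)
            for i in range(n)]

def hermite_gamma(F):
    """gamma = F + F' + F'' + ... (a finite sum, since F is a polynomial)."""
    total, term = [], list(F)
    while term:
        total = combine(total, term)
        term = derivative(term)
    return total

F = [1, 0, 3, 2]                       # F(z) = 1 + 3z^2 + 2z^3, an arbitrary example
gamma = hermite_gamma(F)
# d/dz(-e^{-z} gamma) = e^{-z}(gamma - gamma'), so the identity needs gamma - gamma' = F.
print(combine(gamma, derivative(gamma), sign=-1) == F)

# Hermite's simultaneous approximations share the denominator 21344.
e_approx = Fraction(58019, 21344)
e2_approx = Fraction(157712, 21344)
print(float(e_approx) - math.e, float(e2_approx) - math.e ** 2)
```

Both errors are a few parts in ten million, which is the sense in which the two fractions are good simultaneous approximations with a common denominator.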


He left the problem of proving the transcendence of $\pi$ to others. And soon
afterward, in 1882, Ferdinand Lindemann used Hermite’s methods to prove this.¹³
Lindemann’s theorem was a generalization of Hermite’s: If $z_0, z_1, \ldots, z_n$ were distinct
algebraic numbers and $N_0, N_1, \ldots, N_n$ were algebraic and not all zero, then equation
(35.1) could not hold. The equation $1 + e^{i\pi} = 0$ implied the transcendence of $\pi$.
Lindemann argued that if $\pi$ were algebraic, then by the preceding theorem, $1 + e^{i\pi}$ could
not equal zero. Lindemann’s theorem also implied that when $x \neq 0$ was algebraic,
then all the numbers $e^x$, $\sin x$, $\tan x$, $\sin^{-1} x$, and $\tan^{-1} x$ were transcendental.
Moreover, if $x$ was not equal to one, then $\log x$ was transcendental. Lindemann’s
proof was somewhat sketchy, but in 1885 Weierstrass gave a completely rigorous
proof.¹⁴ In particular, he noted that Lindemann’s theorem followed readily from the
particular case in which $N_0, N_1, \ldots, N_n$ were integers. A number of mathematicians,
including Hilbert, Hurwitz, Markov, Mertens, Sylvester, and Stieltjes improved and
streamlined the proofs of Hermite and Lindemann without introducing any essentially
new methods or results.

13 Lindemann (1882).
14 Weierstrass (1885).
In his famous 1900 lecture at Paris, David Hilbert (1862-1943) gave a list of
twenty-three problems for future mathematicians; the seventh of these was to prove
the transcendence of certain numbers:¹⁵


I should like, therefore, to sketch a class of problems which, in my opinion, should be attacked 
as here next in order. That certain special transcendental functions, important in analysis, take 
algebraic values for certain algebraic arguments, seems to us particularly remarkable and worthy 
of thorough investigation. Indeed, we expect transcendental functions to assume, in general, 
transcendental values for even algebraic arguments; and, although it is well known that there exist 
integral transcendental functions, which even have rational values for all algebraic arguments, 
we shall still consider it highly probable that the exponential function $e^{i\pi z}$, for example, which
evidently has algebraic values for all rational arguments z, will on the other hand always take 
transcendental values for irrational algebraic values of the argument z. We can also give this 
statement a geometrical form, as follows: If in an isosceles triangle, the ratio of the base angle 
to the angle at the vertex be algebraic but not rational, the ratio between base and side is always 
transcendental. In spite of the simplicity of this statement and of its similarity to the problems 
solved by Hermite and Lindemann, I consider the proof of this theorem very difficult; as also the 
proof that The expression $\alpha^{\beta}$, for an algebraic base $\alpha$ and an irrational algebraic exponent $\beta$,
e.g., the number $2^{\sqrt{2}}$, or $e^{\pi} = i^{-2i}$, always represents a transcendental or at least an irrational
number. It is certain that the solution of these and similar problems must lead us to entirely new
methods and to a new insight into the nature of special irrational and transcendental numbers.


Hilbert’s last comment has certainly turned out to be true. The resolution of Hilbert’s 
seventh problem in the 1930s by the efforts of A. O. Gelfond and T. Schneider and the 
work of C. L. Siegel, the latter more directly inspired by Hermite and Lindemann, have 
initiated an era of tremendous growth and development in the theory of transcendental 
numbers. Hilbert himself was not very hopeful of a proof of his theorem within his 
lifetime, a theorem, as we have seen, also stated by Euler. Hilbert thought, in fact, that 
the Riemann hypothesis would be proved first. 

The Russian mathematician Aleksandr O. Gelfond (1906-1968) took the first 
important step toward a proof of the Hilbert-Euler conjecture. He was a colleague 
at Moscow University of I. I. Privalov, whose influence in complex analysis is evident 
in Gelfond’s work. Gelfond was a student of Aleksandr Khinchin who in 1922-23 
studied the metrical properties of continued fractions, in which he obtained important 
results. Khinchin attracted several researchers to a whole range of problems in analytic 
number theory through his 1925—1926 seminar on this subject at Moscow. Gelfond’s 
early work was influenced by a result in analytic functions due to Pólya:¹⁶ If an entire
function assumes integral values for positive rational integral values of its argument
and its growth is restricted by the inequality

\[
|f(z)| < C\,2^{\alpha |z|}, \quad \alpha < 1,
\]

then it must be a polynomial. Roughly speaking, this means that a transcendental entire
function taking integral values at integers must grow at least as fast as $2^z$.

15 Yandell (2002) p. 404.
16 Pólya (1974) vol. 1, pp. 1-16.

Concerning the connection of this result with transcendental numbers, in his Transcendental and
Algebraic Numbers, first published in Russian in 1952, Gelfond wrote,¹⁷


There is a very essential relationship between the growth of an entire analytic function and the 
arithmetic nature of its values for an argument which assumes values in a given algebraic field. If 
we assume in this connection that the values of the function also belong to some definite algebraic 
field, where all the conjugates of every value do not grow too rapidly in this field, then this at 
once places restriction on the growth of the function from below, in other words, it cannot be too 
small. This situation and its analogs for meromorphic functions can be used with success to solve 
transcendence problems. The first theorem concerning the relationship between the growth and 
the arithmetic value of a function was the Pólya theorem.


Hardy, Landau, and Okada successively managed to produce a streamlined proof of
this result. We briefly sketch the argument, showing that very old ideas on interpolation
have continued to play a role in function theory and transcendence theory. First, prove
that if an entire function $f(z)$ satisfies

\[
\limsup_{r \to \infty} \frac{\log M(r, f)}{r} < \log 2,
\]

then

\[
f(0) + \frac{z}{1!} \Delta f(0) + \frac{z(z-1)}{2!} \Delta^2 f(0) + \frac{z(z-1)(z-2)}{3!} \Delta^3 f(0) + \cdots
\]

converges uniformly to $f(z)$ in any finite region of the plane. Note that this is the
Briggs-Harriot-Gregory-Newton interpolation series. Thus, if $f(z)$ is of exponential
type less than $\log 2$, then $f(z)$ is represented by the interpolation series and can be
evaluated at $z = -1$:

\[
f(-1) = f(0) - \Delta f(0) + \Delta^2 f(0) - \cdots.
\]

The convergence of the series implies that $|\Delta^n f(0)| < 1$ for $n > N$. Moreover,
since $\Delta^n f(0)$ is an integer when $f(0), f(1), f(2), \ldots$ are all integers, we may
conclude that $\Delta^n f(0) = 0$ for $n > N$. Thus, $f(z)$ is a polynomial and Pólya’s theorem
is proved.
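
The role of the forward differences in this argument can be illustrated with exact integer arithmetic. In the sketch below (function names are illustrative), an integer-valued polynomial has differences $\Delta^n f(0)$ that vanish from some point on, so the interpolation series is a finite sum, whereas $2^z$ has $\Delta^n f(0) = 1$ for every $n$, so the series at $z = -1$ is $1 - 1 + 1 - \cdots$ and fails to converge. This is the sense in which growth like $2^z$ marks the boundary.

```python
from fractions import Fraction

def forward_differences(values):
    """Return [f(0), Δf(0), Δ^2 f(0), ...] computed from f(0), f(1), f(2), ..."""
    diffs, row = [], list(values)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs

def newton_series(diffs, z):
    """Partial sum of the interpolation series: sum over n of binom(z, n) Δ^n f(0)."""
    total, coeff = Fraction(0), Fraction(1)
    for n, d in enumerate(diffs):
        total += coeff * d
        coeff = coeff * (z - n) / (n + 1)   # binom(z, n+1) from binom(z, n)
    return total

cubic = lambda n: n ** 3 - 2 * n + 5
cubic_diffs = forward_differences([cubic(n) for n in range(8)])
print(cubic_diffs)                           # zero from the fourth difference on
print(newton_series(cubic_diffs, -1), cubic(-1))

power_diffs = forward_differences([2 ** n for n in range(8)])
print(power_diffs)                           # every difference equals 1
```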

In 1929, Gelfond took a step closer to solving Hilbert’s problem when he used this
type of interpolation series to obtain a key transcendence theorem:¹⁸ For $\alpha \neq 0, 1$ and
algebraic, $\alpha^{\sqrt{-p}}$ is transcendental when $p$ is a nonsquare positive integer. In particular,
$2^{\sqrt{-2}}$ and $(-1)^{-i} = e^{\pi}$ are transcendental numbers. Gelfond gave details of only the
particular case that $e^{\pi}$ is transcendental. We present an outline of Gelfond’s proof.
First enumerate the Gaussian integers $m + in$ as a sequence $z_0, z_1, z_2, \ldots$, where one
term precedes another if its absolute value is smaller; if the absolute values are the
same, then the term with the smaller argument comes first. Then expand $e^{\pi z}$ as an
interpolation series

\[
\sum_{n=0}^{\infty} A_n (z - z_0) \cdots (z - z_{n-1}),
\]

where, by Cauchy’s theorem, $A_n$ can be expressed as

\[
A_n = \sum_{k=0}^{n} \frac{e^{\pi z_k}}{B_k}, \quad \text{where} \quad
B_k = \prod_{\substack{j=0 \\ j \neq k}}^{n} (z_k - z_j).
\]

17 Gelfond (1960) p. 97.
18 For a translation, see Gelfond (1960) chapter 3.

This interpolation series converges to $e^{\pi z}$ because of the relatively slow growth
of the function and the relatively high density of the interpolation points. Now if $\Omega_n$
is the least common multiple of $B_0, B_1, \ldots, B_n$, then, by the distribution of the primes
of the form $4n + 1$ and $4n + 3$, it can be established that

\[
|\Omega_n| = e^{\frac{1}{2} n \log n + O(n)} \quad \text{and} \quad \left| \frac{\Omega_n}{B_k} \right| = e^{O(n)}.
\]

If one assumes that $e^{\pi}$ is algebraic, then these estimates can be used to show that
either $A_n = 0$ or that

\[
|\Omega_n A_n| > e^{O(n)}.
\]

However, from the Cauchy integral for $A_n$, it follows that

\[
|\Omega_n A_n| < e^{-\frac{1}{2} n \log n + O(n)}.
\]


The two inequalities contradict one another for large enough $n$, unless $A_n = 0$ for
all $n$ larger than some value. Thus, one may argue that the interpolation series is finite
and hence is a polynomial. This is a contradiction, so that Gelfond could conclude that
his assumption that $e^{\pi}$ was algebraic was false, proving his result.
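
The enumeration of the Gaussian integers and the coefficient formula for $A_n$ can be made concrete as follows. This is only a sketch: the argument is taken in $[0, 2\pi)$, which is an assumption about the tie-breaking convention, and the function names are illustrative. Only the first few coefficients are computed, since floating-point evaluation of the formula is not meant to reproduce the delicate estimates in the proof.

```python
import cmath
import math

def gaussian_integers(max_norm):
    """Gaussian integers m+in with m^2+n^2 <= max_norm, ordered by absolute
    value and then by argument (assumed here to lie in [0, 2*pi))."""
    bound = math.isqrt(max_norm)
    points = [(m, n) for m in range(-bound, bound + 1)
              for n in range(-bound, bound + 1) if m * m + n * n <= max_norm]
    def key(p):
        m, n = p
        return (m * m + n * n, math.atan2(n, m) % (2 * math.pi))
    return [complex(m, n) for m, n in sorted(points, key=key)]

def newton_coefficient(nodes, n):
    """A_n = sum over k <= n of exp(pi z_k)/B_k, B_k = prod over j != k of (z_k - z_j)."""
    total = 0
    for k in range(n + 1):
        b_k = 1
        for j in range(n + 1):
            if j != k:
                b_k *= nodes[k] - nodes[j]
        total += cmath.exp(math.pi * nodes[k]) / b_k
    return total

nodes = gaussian_integers(2)
print(nodes)                                   # starts 0, 1, i, -1, -i, ...
print(newton_coefficient(nodes, 0), newton_coefficient(nodes, 1))
```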

In 1930, R. Kuzmin showed that,¹⁹ with some modifications in Gelfond’s proof, one
could prove the transcendence of $\alpha^{\sqrt{p}}$, with $\alpha$ and $p$ as before. One implication of this
was that $2^{\sqrt{2}}$ was transcendental, as Hilbert and Euler had conjectured. Since for general
algebraic numbers $\beta$ (in $\alpha^{\beta}$), it was no longer possible to find useful upper bounds for
$\Omega_n$, a generalization along these lines was difficult. However, in 1933 K. Boehle was
able to prove by this method that if $\alpha \neq 0, 1$ and $\beta$ was an irrational algebraic number
of degree $n \geq 2$, then at least one of the numbers $\alpha^{\beta}, \alpha^{\beta^2}, \ldots, \alpha^{\beta^{n-1}}$ had to be
transcendental.²⁰ Carl Ludwig Siegel (1896-1981), who had been a student of Landau
at Göttingen, also succeeded in proving Kuzmin’s result after seeing Gelfond’s proof
of the transcendence of $\alpha^{\sqrt{-p}}$. But Siegel did not publish his proof in spite of Hilbert’s
suggestion that he do so. Siegel also made important and very original contributions
to the theory of quadratic forms and to modular forms in several variables. His interest
in the history of mathematics led him to study Riemann’s cryptic unpublished notes
on the zeta function and to discover the Riemann-Siegel formula.

19 Kuzmin (1930).
20 Boehle (1933).
Though the method of Gelfond did not generalize to $\alpha^{\beta}$, it suggested new lines
of research. Gelfond himself applied it to a new proof of Lindemann’s theorem;
in 1943, his student A. V. Lototskii used it to show that certain infinite products
represented irrational numbers;²¹ and in 1932 Siegel showed that if $g_2$ and $g_3$ were
algebraic numbers, then at least one period of the Weierstrass $\wp$ function, satisfying
the equation,

\[
\wp'(z)^2 = 4\wp(z)^3 - g_2 \wp(z) - g_3,
\]

was transcendental.²² In particular, if $\wp(z)$ allowed complex multiplication, then
both periods were transcendental. Siegel’s student, Theodor Schneider (1911-1988),
developed improved methods, allowing him to prove in 1934 that both periods were
transcendental and even their ratio was transcendental, except when $\wp(z)$ permitted
complex multiplication.²³

In 1934, Gelfond published a new method²⁴ by which he obtained the complete
proof of Hilbert’s seventh problem. This proof made use of complex analysis, but
some years later Gelfond and Linnik gave an interesting elementary proof of a special
case, without recourse to analysis, except for Rolle’s theorem.²⁵ In his proof of the
seventh problem, Gelfond assumed the result false. Thus, he posited that there existed
algebraic numbers $\alpha, \beta$, where $\alpha \neq 0, 1$ and $\beta$ was not rational but $\alpha^{\beta} = \lambda$ was
algebraic. On this assumption, there existed algebraic numbers $\alpha$ and $\lambda$ such that
$\beta = \frac{\log \lambda}{\log \alpha}$ was an algebraic irrational number. He then constructed a function

\[
f(z) = \sum_{k=0}^{N} \sum_{m=0}^{N} C_{km}\, \alpha^{kz} \lambda^{mz} = \sum_{k,m} C_{km}\, e^{(ak + bm)z}
\quad (a = \log \alpha,\ b = \log \lambda),
\]


where $N$ was a suitably chosen large integer. Also, the $C_{km}$ were such that their
absolute values and the absolute values of their conjugates were less than $e^{2N^2}$. Note
that $f(z)$ could not be identically zero because $\beta$ had to be irrational; moreover, the
derivative of order $s$ could be expressed as

\[
f^{(s)}(z) = a^s \sum_{k=0}^{N} \sum_{m=0}^{N} C_{km} (k + \beta m)^s\, \alpha^{kz} \lambda^{mz}.
\]


Gelfond proved that if $\alpha^{\beta}$ was an algebraic number, then it was possible to choose
the $(N + 1)^2$ nonzero algebraic numbers $C_{km}$ such that $f^{(s)}(z) = 0$ at $z = 0, 1, \ldots, r_2$
for $0 \leq s \leq r_1$, where $r_1$ was the greatest integer in $\frac{N^2}{\log N}$ and $r_2$ was the greatest
integer in $\log \log N$. All this then implied that $f(z)$ had zeros of sufficiently high order
at $0, 1, \ldots, r_2$. By an ingenious argument using Cauchy’s integral formula, Gelfond
then showed that $f(z)$ had a zero of even higher order at $z = 0$; in fact, of order at
least $(N + 1)^2 + 1$. Thus, he could conclude that the nonzero algebraic numbers $C_{km}$
satisfied the equations

\[
a^{-s} f^{(s)}(0) = \sum_{k,m=0}^{N} C_{km} (k + \beta m)^s = 0, \quad 0 \leq s \leq (N + 1)^2.
\]

Taking the first $(N + 1)^2$ equations, he obtained a system of equations with a
Vandermonde determinant that had to be zero; this could happen if and only if there
were integers

\[
m_1, k_1;\ m_2, k_2 \quad \text{such that} \quad \beta m_1 + k_1 = \beta m_2 + k_2.
\]

This relation yielded the conclusion that $\beta$ was rational, a contradiction to
Gelfond’s assumption, so that the theorem was proved.

21 Lototskii (1943).
22 Siegel (1932).
23 Schneider (1934) II.
24 Gelfond (1934).
25 Gelfond and Linnik (1966).

In 1934, Schneider obtained an independent solution of Hilbert’s seventh
problem.²⁶ His interest in transcendental numbers was aroused by a lecture of Siegel, who
subsequently gave him a list from which to choose a dissertation topic. Schneider
selected a problem on transcendental numbers; he reported,²⁷ “After a few months,
I gave him a work of six pages and then was told by Siegel that the work contained the
solution of Hilbert’s seventh problem.” Schneider’s proof was different in details from
Gelfond’s, but it too depended on the construction of an auxiliary function with a large
number of zeros at specific points. In fact, both these mathematicians had adopted this
technique from a previous work of Siegel on transcendence questions related to the
values of Bessel functions. One may go further back and observe that in his proof of
the transcendence of $e$, Hermite had also constructed a function of this kind!

Siegel’s work of 1929 introduced another important method in the theory
of transcendental numbers. Recall that Hermite’s work depended on the fact that
$\frac{d}{dx} e^x = e^x$. It was not until 1929, when Siegel published his paper on E-functions,
that this idea was generalized to prove the transcendence of values of functions
satisfying linear differential equations. E-functions are entire functions

\[
\sum_{n=0}^{\infty} a_n \frac{x^n}{n!},
\]

where $a_n$ are algebraic numbers satisfying certain arithmetic conditions. First, for
any $\epsilon > 0$, $a_n$ and all its conjugates are $O(n^{\epsilon n})$ as $n \to \infty$ and second, the least
common denominator of $a_0, a_1, \ldots, a_n$ is also $O(n^{\epsilon n})$. Siegel considered a system of
homogeneous linear differential equations of the first order

\[
y_k' = \sum_{l=1}^{m} Q_{kl}(x)\, y_l, \quad (k = 1, \ldots, m),
\]

26 Schneider (1934) I.
27 Yandell (2002) p. 199.
where the $Q_{kl}(x)$ were rational functions with coefficients in a number field $K$.
To obtain transcendence results, Siegel required that some products of powers of
the E-functions $E_1, E_2, \ldots, E_m$, in fact, solutions of this system, satisfy a normality
condition. Siegel formalized the concept of normality in his 1949 book;²⁸ its meaning
was only implicit in his 1929 paper. In spite of the fact that this condition was difficult
to verify, thereby limiting the scope of its application, Siegel was able to employ it to
rederive the classical theorem of Lindemann (and Weierstrass). He also proved a new
theorem on the transcendence of a class of numbers related to the Bessel function:
Observing that

\[
K_{\lambda}(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,(\lambda+1)(\lambda+2)\cdots(\lambda+n)} \Bigl(\frac{x}{2}\Bigr)^{2n},
\]

one may verify that $K_{\lambda}$ is an E-function and satisfies the differential equation

\[
y'' + \frac{2\lambda + 1}{x}\, y' + y = 0.
\]

Siegel’s theorem states that if $\lambda$ is a rational number, $\lambda \neq \pm\frac{1}{2}, -1, \pm\frac{3}{2}, -2, \ldots$,
and $\alpha \neq 0$ an algebraic number, then $K_{\lambda}(\alpha)$ and $K_{\lambda}'(\alpha)$ are algebraically independent.


Note that complex numbers $\zeta_1, \ldots, \zeta_n$ are called algebraically independent if for
every nonzero polynomial $P(x_1, \ldots, x_n)$, in $n$ variables with rational coefficients, we
have

\[
P(\zeta_1, \ldots, \zeta_n) \neq 0.
\]

Otherwise, the $\zeta_i$ are algebraically dependent. Thus, if several numbers are
algebraically independent, then each of them is transcendental. Therefore, $K_{\lambda}(\alpha)$ and
$K_{\lambda}'(\alpha)$ are transcendental. Also, since the Bessel function $J_{\lambda}(x)$ may be expressed as

\[
J_{\lambda}(x) = \frac{1}{\Gamma(\lambda + 1)} \Bigl(\frac{x}{2}\Bigr)^{\lambda} K_{\lambda}(x),
\]

it follows that except for $x = 0$, all the zeros of $J_{\lambda}(x)$ and $J_{\lambda}'(x)$ are transcendental
numbers.
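
The series for $K_{\lambda}$ and its differential equation lend themselves to a direct numerical check. The sketch below sums the series together with its first two termwise derivatives and evaluates the residual $y'' + \frac{2\lambda+1}{x} y' + y$. For $\lambda = \frac{1}{2}$ it also uses the closed form $K_{1/2}(x) = \frac{\sin x}{x}$, which follows from the series but is not stated in the text. The function name and term count are illustrative, and $\lambda$ must not be a negative integer.

```python
import math

def siegel_K(lam, x, terms=40):
    """Return (K, K', K'') for K_lam(x) = sum of (-1)^n (x/2)^(2n) / (n! (lam+1)...(lam+n))."""
    y = dy = d2y = 0.0
    coeff = 1.0                      # (-1)^n / (4^n n! (lam+1)...(lam+n))
    for n in range(terms):
        y += coeff * x ** (2 * n)
        if n >= 1:
            dy += coeff * 2 * n * x ** (2 * n - 1)
            d2y += coeff * 2 * n * (2 * n - 1) * x ** (2 * n - 2)
        coeff *= -1.0 / (4 * (n + 1) * (lam + n + 1))
    return y, dy, d2y

def ode_residual(lam, x):
    """Left side of y'' + (2 lam + 1)/x y' + y = 0 evaluated on the series."""
    y, dy, d2y = siegel_K(lam, x)
    return d2y + (2 * lam + 1) / x * dy + y

print(ode_residual(1 / 3, 2.0), ode_residual(0.5, 1.3))
print(siegel_K(0.5, 1.3)[0], math.sin(1.3) / 1.3)

# Relation to the Bessel function for lam = 1/2: J_{1/2}(x) = sqrt(2/(pi x)) sin x.
x = 1.3
j_half = (x / 2) ** 0.5 / math.gamma(1.5) * siegel_K(0.5, x)[0]
print(j_half, math.sqrt(2 / (math.pi * x)) * math.sin(x))
```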

From his theorem, Siegel obtained the transcendence of certain continued fractions
by noting that

\[
-i\sqrt{x}\,\frac{K_{\lambda}(2i\sqrt{x})}{K_{\lambda}'(2i\sqrt{x})}
 = \lambda + 1 + \cfrac{x}{\lambda + 2 + \cfrac{x}{\lambda + 3 + \cdots}}.
\]

Thus, Siegel’s theorem implied that when $2\lambda$ was not an odd integer, the continued
fraction was transcendental for every nonzero algebraic $x$. But when $2\lambda$ was an odd
integer, Lindemann’s theorem entailed the transcendence of the continued fraction.
Siegel took the special case when $\lambda = 0$ to obtain a nice result: the transcendence of

\[
1 + \cfrac{1}{2 + \cfrac{1}{3 + \cdots}}.
\]

28 Siegel (1949).
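
A numerical comparison makes the continued fraction identity tangible. The sketch below assumes the continued fraction has the form $\lambda + 1 + x/(\lambda + 2 + x/(\lambda + 3 + \cdots))$ and compares it with the ratio $(\lambda+1)\,\Phi(\lambda+1; x)/\Phi(\lambda+2; x)$, where $\Phi(c; x) = \sum_n x^n/(n!\,c(c+1)\cdots(c+n-1))$; this ratio is what the quotient of $K_{\lambda}$ and $K_{\lambda}'$ at $2i\sqrt{x}$ reduces to after differentiating the series termwise. Names, truncation depths, and this reduction are assumptions of the sketch rather than statements from Siegel’s paper.

```python
import math

def phi(c, x, terms=60):
    """Phi(c; x) = sum over n of x^n / (n! c (c+1) ... (c+n-1))."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / ((n + 1) * (c + n))
    return total

def siegel_fraction(lam, x, depth=40):
    """lam+1 + x/(lam+2 + x/(lam+3 + ...)), truncated after `depth` levels."""
    value = lam + depth
    for k in range(depth - 1, 0, -1):
        value = lam + k + x / value
    return value

# lam = 0, x = 1 gives the continued fraction 1 + 1/(2 + 1/(3 + ...)).
print(siegel_fraction(0, 1.0), phi(1, 1.0) / phi(2, 1.0))

# lam = -1/2 is an excluded case (2*lam odd): the value is elementary.
print(siegel_fraction(-0.5, 1.0), 1 / math.tanh(2.0))
```

In the excluded case $\lambda = -\frac{1}{2}$ the value reduces to $\sqrt{x}\coth(2\sqrt{x})$, which is consistent with the remark that Lindemann’s theorem, rather than Siegel’s, handles odd $2\lambda$.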


Siegel obtained the Lindemann-Weierstrass theorem using his method: Take
algebraic numbers $\alpha_1, \ldots, \alpha_m$ linearly independent over the rational number field. The
E-functions are $E_k(x) = e^{\alpha_k x}$, $(k = 1, \ldots, m)$, and the $\mu$ power products take the
form $e^{\rho_k x}$ $(k = 1, \ldots, \mu)$ with $\mu$ different algebraic numbers $\rho_k$. The system of
equations takes the form $y_k' = \rho_k y_k$ $(k = 1, \ldots, \mu)$; verifying the normality condition
in this case reduces to proving that any equation

\[
P_1(x) e^{\rho_1 x} + \cdots + P_{\mu}(x) e^{\rho_{\mu} x} = 0,
\]

where $P_i(x)$ are polynomials, implies that $P_1 = 0, \ldots, P_{\mu} = 0$. This is easy to show,
and the Lindemann theorem follows from Siegel’s theorem.
Siegel described the historical background to his work:²⁹


Lambert’s work was generalized by Legendre who considered the power series 


o° n 


= = a eae heme) 
yO 2 a ee (a #0, —1,—2,...) 


satisfying the linear differential equation of second order y”-++ay’ = y. He obtained the continued 
fraction expansion 


XxX 
a + 1+ ——_ 
a+2+.--- 


and proved the irrationality of for all rational x # 0 and all rational aa 4 0, — 1, — 2,.... 
In the special case a = 5 we have y = cosh(2,/x), y’ = sme so that Legendre’s theorem 
contains the irrationality of ea for rational a2 + 0. In more recent times, Stridsberg proved the 
irrationality of $y$ and of $y'$, separately, for rational $x \neq 0$ and rational $\alpha \neq 0, -1, \ldots$, and
Maier showed that neither $y$ nor $y'$ is a quadratic irrationality. Maier’s work suggested the idea
of introducing more general approximation forms which enabled me to prove that the numbers $y$
and $y'$ are not connected by any algebraic equation with algebraic coefficients, for any algebraic
$x \neq 0$ and any rational $\alpha \neq 0, \pm\tfrac12, -1, \pm\tfrac32, \ldots$. The excluded case of an integer $\alpha + \tfrac12$ really
is an exception, since then the function $f_\alpha(x)$ satisfies an algebraic differential equation of first
order whose coefficients are polynomials in $x$ with rational numerical coefficients; this follows
from the explicit formulas

\[
f_{k+\frac12} = \frac12 \cdot \frac32 \cdots \left(k - \frac12\right) D^k \cosh(2\sqrt{x}),
\]
\[
f_{\frac12-k} = \frac{(-1)^k x^{k+\frac12}}{\frac12 \cdot \frac32 \cdots \left(k - \frac12\right)}\, D^{k+1} \sinh(2\sqrt{x}) \qquad (k = 0, 1, 2, \ldots).
\]


29 ibid. pp. 31-32.
For instance, in case $\alpha = \tfrac12$ the differential equation is $y^2 - xy'^2 = 1$. In the excluded case,
however, Lindemann’s theorem shows that $y$ and $y'$ are both transcendental for any algebraic
$x \neq 0$.


Due to the difficulty in verifying the normality condition, only these examples 
involving the exponential function and the Bessel function were obtained by this 
method between the publication of Siegel’s paper in 1929 and his book in 1949. 
Finally, in 1988, F. Beukers, W. D. Brownawell, and G. Heckman applied differential
Galois theory to obtain a more tractable equivalent of the normality condition.³⁰
They were able to verify the normality condition for a large class of hypergeometric 
functions. In their theory, the algebraic relations between the solutions of differential 
equations could be studied by means of the classification of linear algebraic groups. 
We note that the work of E. Vessiot, G. Fano, and E. Picard on linear differential 
equations during the late nineteenth century provided the foundation for differential 
Galois theory. Starting in 1948, E. Kolchin’s work, itself based on the earlier 1932 
book of J. F. Ritt, brought differential Galois theory to maturity. 

In the period 1953-1959, Andrei Shidlovskii (1915-2007), student of Gelfond and
teacher of V. A. Oleinikov, made major advances in the theory of E-functions. In
1954, he was able to replace Siegel’s normality condition with a certain irreducibility
condition, enabling him to work with some E-functions satisfying third- or fourth-order
linear differential equations. A year later, he obtained stronger results; we give
definitions before stating one of his theorems. Functions $f_1(z), f_2(z), \ldots, f_m(z)$ are
homogeneously algebraically independent over $\mathbb{C}(z)$ if $P(f_1(z), \ldots, f_m(z)) \neq 0$ for
every nonzero homogeneous polynomial $P$ in $m$ variables with coefficients in $\mathbb{C}(z)$.
Similarly, complex numbers $w_1, \ldots, w_m$ are said to be homogeneously algebraically
independent over the field of algebraic numbers if $P(w_1, \ldots, w_m) \neq 0$ for every
nonzero homogeneous polynomial $P$ with algebraic numbers as coefficients. Now
suppose

\[
y_k' = \sum_{i=1}^{m} Q_{k,i}\, y_i \quad (k = 1, \ldots, m), \qquad Q_{k,i} \in \mathbb{C}(z), \tag{35.2}
\]

and suppose $T(z)$ is the least common denominator of all the $m^2$ rational functions
$Q_{k,i}$. Shidlovskii’s theorem may then be stated as: Let $f_1(z), f_2(z), \ldots, f_m(z)$ be a set
of E-functions that satisfy the system of equations (35.2) and are homogeneously
algebraically independent over $\mathbb{C}(z)$, and let $\zeta$ be an algebraic number such that
$\zeta T(\zeta) \neq 0$. Then the numbers $f_1(\zeta), \ldots, f_m(\zeta)$ are homogeneously algebraically
independent.³¹

We may get an idea of the mathematical tradition within which Gelfond, 
Shidlovskii, and their students did their work by reading Mikhail Gromov’s comments 
on his experience as a student in Russia:³² “There was a very strong romantic
attitude toward science and mathematics: the idea that the subject is remarkable 
and that it is worth dedicating your life to. ... that is an attitude that I and many 


30 Beukers, Brownawell, and Heckman (1988). 
31 See Shidlovskii (1989) chapter 3. 
32 Raussen and Skau (2010) p. 392. 




other mathematicians coming from Russia have inherited.” The accounts of the
Gelfand seminars in Moscow, by Gromov, Landis and others, describe this attitude.
The seminars extended to many hours of enthusiastic, colorful, and passionate
discussion.³³

During the 1960s, Alan Baker, a student of Harold Davenport, effected another
important and very productive development in transcendental number theory. He
proved a substantial generalization of the Gelfond–Schneider theorem of 1934. We
may state the latter in the form: If $\alpha$ and $\beta$ are nonzero algebraic numbers and $\log\alpha$
and $\log\beta$ are independent over the rationals, then for any nonzero algebraic numbers
$\alpha_1$ and $\beta_1$,

\[
\alpha_1 \log\alpha + \beta_1 \log\beta \neq 0.
\]

In 1939, Gelfond obtained an explicit lower bound for $|\alpha_1 \log\alpha + \beta_1 \log\beta|$ in terms of
the degrees and height of the four algebraic numbers.³⁴ In a paper of 1948, Yuri Linnik
and Gelfond pointed out that if a lower bound could be obtained for a similar three-term
sum, then it would follow that the number of imaginary quadratic fields of class
number one was finite; note that this result was one case within Gauss’s class number
problem. In 1966, Baker began to study this question by means of linear forms in
logarithms.³⁵ In that year, he established that if $\alpha_1, \alpha_2, \ldots, \alpha_n$ were nonzero algebraic
numbers such that $\log\alpha_1, \log\alpha_2, \ldots, \log\alpha_n$ were independent over the rationals,
then $1, \log\alpha_1, \ldots, \log\alpha_n$ would be independent over the field of algebraic numbers.
As a corollary, Baker obtained the generalization of the Gelfond–Schneider theorem:
If $\alpha_1, \alpha_2, \ldots, \alpha_n$ are algebraic but not 0 or 1; and if $\beta_1, \beta_2, \ldots, \beta_n$ are algebraic
numbers such that $1, \beta_1, \beta_2, \ldots, \beta_n$ are linearly independent over the rationals, then
$\alpha_1^{\beta_1}\alpha_2^{\beta_2}\cdots\alpha_n^{\beta_n}$ is transcendental. Baker also found an effectively computable lower
bound for the absolute values of a nonvanishing linear form

\[
|\beta_0 + \beta_1 \log\alpha_1 + \cdots + \beta_n \log\alpha_n|.
\]

This result was applicable to a number of outstanding number theory problems,
including Gauss’s class number problem.³⁶ But some of these number theoretic
problems were also solved by other methods. In 1967, H. M. Stark solved the class
number one problem by means of the theory of modular functions. Two years later,
he published another paper, explaining that Kurt Heegner’s 1952 solution of this
problem was essentially sound.³⁷ In constructing his proof,³⁸ Heegner, a secondary
school teacher, had made use of his deep understanding of Heinrich Weber’s work
on modular functions. Perhaps Heegner took it for granted that his readers would be
equally familiar with Weber; this may have rendered his proof opaque. Indeed, Serre
remarked that he found the paper very difficult to understand.


33 See, for example, Zdravkovska and Duren (1993) p. 69. 
34 Gelfond (1939). 

35 Baker (1979) chapter 2. 

36 ibid. Chapter 5. 

37 Stark (1967) and (1969). 

38 Heegner (1952).
It is remarkable that mathematicians have managed to learn so much about
transcendental numbers, for they have very strange properties. For example, transcen- 
dental numbers do not behave well under the usual algebraic operations. Moreover, 
even though it is true that if a number can be approximated sufficiently well by rational 
numbers, it must be transcendental, there nevertheless exist transcendental numbers 
that are not able to be approximated even as well as some quadratic irrational numbers. 
Indeed, Weil has stated that a preliminary version of Siegel’s 1929 paper ended with
the remark: “Ein Bourgeois, wer noch Algebra treibt! Es lebe die unbeschränkte
Individualität der transzendenten Zahlen!” [It’s a bourgeois, who still does algebra!
Long live the unrestricted individuality of transcendental numbers!].³⁹


35.2 Liouville Numbers 


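In a paper of 1851,⁴⁰ based on earlier work, Liouville constructed his transcendental
numbers by proving that if $x$ was the root of an irreducible polynomial of degree $n > 1$
with integer coefficients

\[
f(x) = ax^n + bx^{n-1} + \cdots + gx + h,
\]

then there existed a constant $A > 0$ such that $\left|x - \frac{p}{q}\right| > \frac{1}{Aq^n}$ for all rational numbers
$\frac{p}{q}$. Although the absolute value sign was not in use at that time, Liouville made
his meaning clear. To prove this theorem, Liouville supposed that $x, x_1, x_2, \ldots, x_{n-1}$
comprised all the roots of $f(x) = 0$ so that

\[
f\!\left(\frac{p}{q}\right) = a\left(\frac{p}{q} - x\right)\left(\frac{p}{q} - x_1\right)\cdots\left(\frac{p}{q} - x_{n-1}\right).
\]

He then set

\[
f(p, q) = q^n f\!\left(\frac{p}{q}\right) = ap^n + bp^{n-1}q + \cdots + hq^n
\]

so that he could write

\[
\frac{p}{q} - x = \frac{f(p, q)}{q^n\, a\left(\frac{p}{q} - x_1\right)\left(\frac{p}{q} - x_2\right)\cdots\left(\frac{p}{q} - x_{n-1}\right)}.
\]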


39 Weil (1992) p. 53. 
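40 Liouville (1851).
Since $f$ was irreducible of degree $n > 1$, it had no rational root, so $q^n f\!\left(\frac{p}{q}\right)$ was a nonzero
integer and its absolute value was at least 1. Moreover, the product
$a\left(\frac{p}{q} - x_1\right)\cdots\left(\frac{p}{q} - x_{n-1}\right)$
was bounded by a maximum value for values of $\frac{p}{q}$ in a neighborhood of $x$ (of, say,
radius 1). Liouville denoted that maximum by $A$. It then became clear that

\[
\left|\frac{p}{q} - x\right| > \frac{1}{Aq^n},
\]

so that the proof was complete. Note that for points $\frac{p}{q}$ outside the radius 1 around $x$,
one has $\left|\frac{p}{q} - x\right| > 1$, so that the result holds. Liouville went on to show that the result
was valid even when $n = 1$. In that case, $f(x) = ax + b = 0$ and so

\[
\frac{p}{q} - x = \frac{p}{q} + \frac{b}{a} = \frac{ap + bq}{aq}.
\]

If $x \neq \frac{p}{q}$, then $ap + bq \neq 0$, and

\[
\left|\frac{p}{q} - x\right| \geq \frac{1}{|a|\,q} = \frac{1}{Aq}.
\]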


Liouville used his theorem to produce examples of transcendental numbers. He argued
that a given number $x$ could not be algebraic unless there was a constant $A$ such that
$\left|\frac{p}{q} - x\right| > \frac{1}{Aq^n}$. He took $x$ to be defined by the series

\[
x = \frac{1}{l} + \frac{1}{l^{2!}} + \frac{1}{l^{3!}} + \cdots + \frac{1}{l^{m!}} + \cdots,
\]

where $l$ was an integer $\geq 2$. He let the partial sum up to the term whose denominator
was $l^{m!}$ be $\frac{p}{q}$ so that $q = l^{m!}$. Liouville then observed that

\[
x - \frac{p}{q} = \frac{1}{l^{(m+1)!}} + \frac{1}{l^{(m+2)!}} + \cdots < \frac{2}{l^{(m+1)!}} = \frac{2}{q^{m+1}}.
\]

This inequality followed from the series

\begin{align*}
\frac{1}{l^{(m+1)!}} + \frac{1}{l^{(m+2)!}} + \cdots
&= \frac{1}{l^{(m+1)!}}\left(1 + \frac{1}{l^{(m+1)!\,(m+1)}} + \cdots\right) \\
&< \frac{1}{l^{(m+1)!}}\left(1 + \frac{1}{l} + \frac{1}{l^2} + \cdots\right) \leq \frac{2}{l^{(m+1)!}}.
\end{align*}

By increasing $m$, he saw that for any fixed $A$ and $n$ he could not obtain $x - \frac{p}{q} > \frac{1}{Aq^n}$,
thus proving that $x$ was transcendental. Liouville also noted the more general case; if
he took

\[
x = \frac{k_1}{l} + \frac{k_2}{l^{2!}} + \frac{k_3}{l^{3!}} + \cdots + \frac{k_m}{l^{m!}} + \cdots,
\]

where $k_1, k_2, \ldots, k_m, \ldots$ were nonzero integers bounded by a constant, then $x$ would
be transcendental. He gave the example in which $l$ could take the value 10 and the $k_m$
could then take values between 1 and 9 inclusive. As an example of a slightly different
kind, he considered

\[
x = \frac{1}{l} + \frac{1}{l^4} + \frac{1}{l^9} + \cdots + \frac{1}{l^{m^2}} + \cdots.
\]

For $q = l^{m^2}$,

\[
x - \frac{p}{q} = \frac{1}{l^{(m+1)^2}} + \cdots < \frac{2}{l^{(m+1)^2}} = \frac{2}{q\,l^{2m+1}};
\]

therefore, $x$ could not be a root of a first-degree equation with rational coefficients and
was hence irrational.
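
Liouville's estimate for the series with denominators $l^{m!}$ can be checked with exact rational arithmetic. The following Python sketch is only an illustration and is not taken from Liouville's paper; the function names and the sample values $l = 10$, $n = 3$, $A = 1000$ are assumptions made for the example. Because the tail of the series is infinite, the code bounds it from above by two further terms plus the geometric estimate used in the text.

```python
from fractions import Fraction
from math import factorial


def tail_upper_bound(l: int, m: int, extra: int = 2) -> Fraction:
    """Upper bound for x - p/q, the sum over k > m of 1/l^(k!).

    The first `extra` terms are summed exactly; the remainder is bounded by
    2/l^((m+extra+1)!), the geometric estimate described in the text.
    """
    head = sum(Fraction(1, l ** factorial(k)) for k in range(m + 1, m + 1 + extra))
    return head + Fraction(2, l ** factorial(m + 1 + extra))


def liouville_estimate_holds(l: int, m: int) -> bool:
    """Check x - p/q < 2/q^(m+1), where q = l^(m!)."""
    q = l ** factorial(m)
    return tail_upper_bound(l, m) < Fraction(2, q ** (m + 1))


def algebraic_bound_fails(l: int, m: int, n: int, A: int) -> bool:
    """True if x - p/q <= 1/(A q^n), so the lower bound required of an
    algebraic number of degree n with constant A fails at this partial sum."""
    q = l ** factorial(m)
    return tail_upper_bound(l, m) <= Fraction(1, A * q ** n)


if __name__ == "__main__":
    # Hypothetical sample values: l = 10, degree n = 3, constant A = 1000.
    for m in range(1, 5):
        print(m, liouville_estimate_holds(10, m), algebraic_bound_fails(10, m, 3, 1000))
```

For any fixed $n$ and $A$ the second check eventually succeeds as $m$ grows, which is the content of Liouville's argument; the particular $m$ at which this first happens depends on the sample values chosen.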


35.3 Hermite’s Proof of the Transcendence of e 


In his 1873 paper “Sur la fonction exponentielle,” Hermite gave two proofs that e was
transcendental.⁴¹ We give his second and more rigorous proof, following his notation
for the most part, except that in some places we employ the matrix notation, not
explicitly used by Hermite. About fifteen years before Hermite gave his proof, Cayley
introduced the matrix notation and some of the elementary algebraic properties of
matrices. It was some time, however, before the usefulness of matrices was generally
recognized. We first sketch the structure of Hermite’s argument. Take a relation of the
form

\[
e^{z_0}N_0 + e^{z_1}N_1 + \cdots + e^{z_n}N_n = 0, \tag{35.3}
\]

where $z_0, z_1, \ldots, z_n$ are distinct nonnegative integers and $N_0, N_1, \ldots, N_n$ are any
integers. It is clear that unless all the $N_i$ are zero, e is an algebraic number. Hermite
defined a set of $n(n + 1)$ numbers $\eta^i_j$, $i = 0, 1, \ldots, n$ and $j = 1, 2, \ldots, n$, by the
equation

\[
\eta^i_j = \frac{1}{(m-1)!}\int_{z_0}^{z_j} \frac{e^{-z} f^m(z)}{z - z_i}\,dz, \tag{35.4}
\]

where $m$ was some positive integer and

\[
f(z) = (z - z_0)(z - z_1)(z - z_2)\cdots(z - z_n). \tag{35.5}
\]

He showed that the numbers $\eta^i_j$ got arbitrarily small as $m$ became large. To
demonstrate this fact, Hermite’s reasoning was that since $e^{-z}$ was always positive,
for any continuous function $F(z)$, he had

\[
\int_{z_0}^{Z} e^{-z}F(z)\,dz = F(\xi)\int_{z_0}^{Z} e^{-z}\,dz = F(\xi)\left(e^{-z_0} - e^{-Z}\right),
\]


41 Hermite (1900-1917) vol. 3, pp. 150-181.
where $\xi$ lay between $z_0$ and $Z$, the limits of integration. By choosing $Z = z_j$ and

\[
F(z) = \frac{f^m(z)}{(m-1)!\,(z - z_i)},
\]

he obtained

\[
\eta^i_j = \frac{f^{m-1}(\xi)}{(m-1)!}\cdot\frac{f(\xi)}{\xi - z_i}\left(e^{-z_0} - e^{-z_j}\right).
\]

This proved that $\eta^i_j \to 0$ as $m \to \infty$. Thus, the $(n + 1) \times n$ matrix $\eta = (\eta^i_j)$,
with $\eta^i_j$ the element in the $i$th row and the $j$th column, depends on $m$. Denote this
dependence by $\eta(m)$. Hermite determined a relation between $\eta(m)$ and $\eta(m - 1)$
giving, by iteration, a relation between $\eta(m)$ and $\eta(1)$. Let us write the first relation as

\[
\eta(m) = \Theta(m)\,\eta(m - 1), \tag{35.6}
\]

where $\Theta(m)$ is an $(n + 1) \times (n + 1)$ matrix depending on $z_0, z_1, \ldots, z_n$. Thus,

\[
\eta(m) = \Theta(m)\Theta(m - 1)\cdots\Theta(2)\,\eta(1),
\]

and we write (following Hermite) the element in the $i$th row and $j$th column of the
matrix $\Theta$ as $\theta(j, i)$ where $i$ and $j$ run from 0 to $n$. Hermite showed that the $\theta$s were
integers and that

\[
\det\Theta(k) = \pm\prod_{0 \le i < j \le n} (z_i - z_j)^2, \quad \text{for } k = 2, \ldots, m. \tag{35.7}
\]

He then obtained an explicit expression for the elements of $\eta(1)$ in a suitable form:
Let $\zeta$ denote any one of $z_0, z_1, \ldots, z_n$. He set

\[
F(z) = \frac{f(z)}{z - \zeta}
\]

and

\[
\int e^{-z}F(z)\,dz = -e^{-z}\gamma(z), \tag{35.8}
\]

where $\gamma(z) = F(z) + F'(z) + F''(z) + \cdots + F^{(n)}(z)$. Hermite noted that if

\[
f(z) = z^{n+1} + p_1 z^n + p_2 z^{n-1} + \cdots + p_{n+1},
\]

then


\[
F(z) = z^n + (\zeta + p_1)z^{n-1} + (\zeta^2 + p_1\zeta + p_2)z^{n-2} + \cdots, \tag{35.9}
\]

and the coefficients of the two polynomials were integers. From this he could
conclude that

\[
\gamma(z) = \Phi(z, \zeta) = z^n + \phi_1(\zeta)z^{n-1} + \phi_2(\zeta)z^{n-2} + \cdots + \phi_n(\zeta), \tag{35.10}
\]

with $\phi_i(\zeta)$ a monic polynomial in $\zeta$ of degree $i$ and with integer coefficients. We may
let $\Phi$ denote the matrix with entry $\Phi(z_j, z_i)$ in the $i$th row and $j$th column where $i$
and $j$ run from 0 to $n$. Again, these entries must be integers and $\det\Phi = \det\Theta$. From
(35.8) and (35.10), he then had

\[
\int_{z_0}^{Z} \frac{e^{-z}f(z)}{z - \zeta}\,dz = e^{-z_0}\Phi(z_0, \zeta) - e^{-Z}\Phi(Z, \zeta), \tag{35.11}
\]

giving Hermite the required values of the entries of $\eta(1)$. For the final step, let
$X = \Theta(m)\cdots\Theta(2)\Phi$ so that the elements $\eta^i_j$ of $\eta(m)$ are given by

\[
\eta^i_j = e^{-z_0}X_{i0} - e^{-z_j}X_{ij}, \tag{35.12}
\]

where the integers $X_{ij}$ are the entries of $X$. Note that (35.12) gives rational
approximations of $e^{z_j - z_0}$ for $j$ running from 1 to $n$. Now, by (35.3) and (35.12),

\begin{align*}
e^{z_1}\eta^i_1 N_1 + e^{z_2}\eta^i_2 N_2 + \cdots + e^{z_n}\eta^i_n N_n
&= e^{-z_0}\left(e^{z_1}N_1 + e^{z_2}N_2 + \cdots + e^{z_n}N_n\right)X_{i0} \\
&\quad - \left(X_{i1}N_1 + X_{i2}N_2 + \cdots + X_{in}N_n\right) \\
&= -\left(X_{i0}N_0 + X_{i1}N_1 + X_{i2}N_2 + \cdots + X_{in}N_n\right).
\end{align*}

Hermite argued that, since the $X_{ij}$ and the $N_j$ were integers, the term on the
right-hand side was an integer, but the term on the left-hand side could be made arbitrarily
small because of the $\eta^i_j$. Therefore, he concluded that

\[
X_{i0}N_0 + X_{i1}N_1 + \cdots + X_{in}N_n = 0, \quad i = 0, 1, \ldots, n.
\]

We can write this system of equations as $XN = 0$, where the components of the
vector $N$ are $N_0, N_1, \ldots, N_n$. Since $\det X = (\det\Theta)^{m-1}\det\Phi = (\det\Theta)^m \neq 0$, we
must have $N = 0$. This completes our outline of Hermite’s proof.
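
The integration formula (35.8) used in this outline is easy to test on an example. Since the derivative of $-e^{-z}\gamma(z)$ is $e^{-z}(\gamma(z) - \gamma'(z))$, formula (35.8) is equivalent to the polynomial identity $\gamma - \gamma' = F$. The Python sketch below is an illustration only, not Hermite's computation; the helper names and the sample roots $0, 1, 2, 3$ with $\zeta = 2$ are assumptions. Polynomials are stored as integer coefficient lists, lowest degree first.

```python
def poly_from_roots(roots):
    """Coefficients (lowest degree first) of the monic polynomial with the given roots."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                   # z * p(z)
        scaled = [-r * c for c in coeffs] + [0]  # -r * p(z)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs


def derivative(p):
    """Formal derivative of a polynomial given by its coefficient list."""
    return [i * c for i, c in enumerate(p)][1:] or [0]


def add(p, q):
    """Sum of two polynomials of possibly different lengths."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def gamma_poly(F):
    """gamma = F + F' + F'' + ... (the sum stops once a derivative vanishes)."""
    total, term = [0], F
    while any(term):
        total = add(total, term)
        term = derivative(term)
    return total


def satisfies_35_8(F):
    """(35.8) holds exactly when gamma - gamma' equals F."""
    g = gamma_poly(F)
    return add(g, [-c for c in derivative(g)]) == F


if __name__ == "__main__":
    # Hypothetical example: f has roots 0, 1, 2, 3 and zeta = 2, so F = f / (z - 2).
    F = poly_from_roots([0, 1, 3])
    print(F, gamma_poly(F), satisfies_35_8(F))
```

In this example $\gamma$ is again monic with integer coefficients, in line with (35.10).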

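Now let us see how Hermite obtained the basic formulas (35.6) and (35.7). To prove
(35.6), Hermite showed that

\begin{align*}
\int_{z_0}^{Z} \frac{e^{-z}f^{m+1}(z)}{z - \zeta}\,dz
&= m\,\theta(z_0, \zeta)\int_{z_0}^{Z} \frac{e^{-z}f^m(z)}{z - z_0}\,dz
 + m\,\theta(z_1, \zeta)\int_{z_0}^{Z} \frac{e^{-z}f^m(z)}{z - z_1}\,dz \\
&\quad + \cdots + m\,\theta(z_n, \zeta)\int_{z_0}^{Z} \frac{e^{-z}f^m(z)}{z - z_n}\,dz, \tag{35.13}
\end{align*}

where $\theta(z, \zeta)$ was of the form

\[
\theta(z, \zeta) = z^n + \alpha_1(\zeta)z^{n-1} + \alpha_2(\zeta)z^{n-2} + \cdots + \alpha_n(\zeta), \tag{35.14}
\]

with $\alpha_1(\zeta), \alpha_2(\zeta), \ldots, \alpha_n(\zeta)$ monic polynomials in $\zeta$, with integer coefficients, and
where $Z$ and $\zeta$ took values in $z_0, z_1, \ldots, z_n$. For this, he needed the auxiliary formula
that there existed polynomials $\theta(z)$ and $\theta_1(z)$ of degree $n$, such that

\[
\int \frac{e^{-z}G(z)f(z)}{z - \zeta}\,dz = \int \frac{e^{-z}G(z)\theta_1(z)}{f(z)}\,dz - e^{-z}G(z)\theta(z), \tag{35.15}
\]

where $G(z) = (f(z))^m$. After taking the derivative of (35.15) and multiplying across
by $e^{z}f(z)/G(z)$, he had only to determine $\theta_1(z)$ and $\theta(z)$ so that

\[
\frac{f^2(z)}{z - \zeta} = \theta_1(z) + \left[1 - \frac{G'(z)}{G(z)}\right]f(z)\theta(z) - f(z)\theta'(z). \tag{35.16}
\]

He set $z = z_i$ in this equation, and got $0 = \theta_1(z_i) - mf'(z_i)\theta(z_i)$ or

\[
\theta_1(z_i) = mf'(z_i)\theta(z_i), \quad i = 0, 1, \ldots, n. \tag{35.17}
\]

Once the values $\theta(z_i)$ were found, the $n + 1$ values determined the polynomials $\theta_1(z)$
and $\theta(z)$. To this end, he divided equation (35.16) by $f(z)$ to get

\[
\frac{f(z)}{z - \zeta} = \frac{\theta_1(z)}{f(z)} + \left[1 - \frac{G'(z)}{G(z)}\right]\theta(z) - \theta'(z). \tag{35.18}
\]

Next, by (35.17), the fractional part of $\left[1 - \frac{G'(z)}{G(z)}\right]\theta(z)$ cancelled with $\frac{\theta_1(z)}{f(z)}$ and
hence to determine $\theta(z)$ Hermite had to consider only the polynomial part of
$\left[1 - \frac{G'(z)}{G(z)}\right]\theta(z)$. So he supposed

\[
\theta(z) = \alpha_0 z^n + \alpha_1 z^{n-1} + \alpha_2 z^{n-2} + \cdots + \alpha_n.
\]

By taking the logarithmic derivative of $G(z)$, he obtained

\[
\frac{G'(z)}{G(z)} = \frac{m}{z - z_0} + \frac{m}{z - z_1} + \cdots + \frac{m}{z - z_n}
= \frac{s_0}{z} + \frac{s_1}{z^2} + \frac{s_2}{z^3} + \cdots, \tag{35.19}
\]

where $s_i = m(z_0^i + z_1^i + \cdots + z_n^i)$. Thus, comparing the coefficients of the polynomials
on the two sides of (35.18) and using (35.9), he got the relations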
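\begin{align*}
1 &= \alpha_0, \\
\zeta + p_1 &= \alpha_1 - \alpha_0(s_0 + n), \\
\zeta^2 + p_1\zeta + p_2 &= \alpha_2 - \alpha_1(s_0 + n - 1) - \alpha_0 s_1, \\
&\;\;\vdots
\end{align*}

These equations yielded the required coefficients of $\theta(z)$:

\begin{align*}
\alpha_0 &= 1, \\
\alpha_1 &= \zeta + p_1 + s_0 + n, \\
\alpha_2 &= \zeta_2 + (s_0 + n - 1)\zeta_1 + (s_0 + n)(s_0 + n - 1) + s_1,
\end{align*}

where $\zeta_2 = \zeta^2 + p_1\zeta + p_2$, $\zeta_1 = \zeta + p_1$. Thus, $\alpha_i$ was shown to be a monic polynomial
of degree $i$ in $\zeta$, and Hermite could write

\[
\theta(z) = \theta(z, \zeta) = z^n + \alpha_1(\zeta)z^{n-1} + \alpha_2(\zeta)z^{n-2} + \cdots + \alpha_n(\zeta). \tag{35.20}
\]

But in order to derive (35.13), Hermite set the limits of integration in (35.15) from
$z_0$ to $Z$, where $Z$ was one of the values $z_0, z_1, \ldots, z_n$; he arrived at

\[
\int_{z_0}^{Z} \frac{e^{-z}G(z)f(z)}{z - \zeta}\,dz = \int_{z_0}^{Z} \frac{e^{-z}G(z)\theta_1(z)}{f(z)}\,dz. \tag{35.21}
\]

By (35.17),

\[
\frac{\theta_1(z)}{f(z)} = \frac{m\theta(z_0)}{z - z_0} + \frac{m\theta(z_1)}{z - z_1} + \cdots + \frac{m\theta(z_n)}{z - z_n}, \tag{35.22}
\]

and then, in order to recognize the dependence on $\zeta$, he wrote, as in (35.20), $\theta(z_i) =
\theta(z_i, \zeta)$. And when (35.22) was substituted in (35.21), he got (35.13).

Now, in order to prove (35.7), observe that from (35.20), $\det\Theta$ can be obtained by
multiplying the determinants

\[
\begin{vmatrix}
z_0^n & z_0^{n-1} & \cdots & 1 \\
z_1^n & z_1^{n-1} & \cdots & 1 \\
\vdots & \vdots & & \vdots \\
z_n^n & z_n^{n-1} & \cdots & 1
\end{vmatrix}
\quad\text{and}\quad
\begin{vmatrix}
1 & 1 & \cdots & 1 \\
\alpha_1(z_0) & \alpha_1(z_1) & \cdots & \alpha_1(z_n) \\
\vdots & \vdots & & \vdots \\
\alpha_n(z_0) & \alpha_n(z_1) & \cdots & \alpha_n(z_n)
\end{vmatrix},
\]

each of which reduces, up to sign, to the Vandermonde determinant of $z_0, z_1, \ldots, z_n$ because
$\alpha_k$ is monic of degree $k$; this completes Hermite’s proof of (35.7).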


35.4 Hilbert’s Proof of the Transcendence of e 


In 1893, Hilbert presented a very efficient proof⁴² of the transcendence of e. His
elegant proof was based on ideas of Lindemann, Weierstrass, and Paul Gordan. To
begin the proof, take $\rho$ to be a positive integer, and set

\[
T = z^\rho\left[(z - 1)(z - 2)\cdots(z - n)\right]^{\rho+1}e^{-z}.
\]


42 Hilbert (1893).
Suppose e is not transcendental. Then we may set $a, a_1, a_2, \ldots, a_n$ to be integers
such that

\[
a + a_1e + a_2e^2 + \cdots + a_ne^n = 0.
\]

Then

\begin{align*}
&\frac{a}{\rho!}\int_0^\infty T\,dz + \frac{a_1e}{\rho!}\int_1^\infty T\,dz + \frac{a_2e^2}{\rho!}\int_2^\infty T\,dz + \cdots + \frac{a_ne^n}{\rho!}\int_n^\infty T\,dz \\
&\quad + \left(\frac{a_1e}{\rho!}\int_0^1 T\,dz + \frac{a_2e^2}{\rho!}\int_0^2 T\,dz + \cdots + \frac{a_ne^n}{\rho!}\int_0^n T\,dz\right) = 0, \tag{35.23}
\end{align*}

or $P_1 + P_2 = 0$, where $P_2$ is the sum inside the parentheses and $P_1$ is the part outside.
In the term

\[
\frac{a_ke^k}{\rho!}\int_k^\infty T\,dz,
\]

with $k \geq 1$, contained in $P_1$, change $z$ to $z + k$. We then have

\begin{align*}
&\frac{a_ke^k}{\rho!}\int_0^\infty e^{-(z+k)}(z+k)^\rho(z+k-1)^{\rho+1}\cdots(z+1)^{\rho+1}z^{\rho+1}(z-1)^{\rho+1}\cdots(z+k-n)^{\rho+1}\,dz \\
&\quad = \frac{a_k}{\rho!}\int_0^\infty \sum_m t_m z^{\rho+m+1}e^{-z}\,dz,
\end{align*}

where $\sum_m t_m z^m$ is a polynomial in $z$ with integer coefficients. Take one term in the sum,
and evaluate as a gamma integral to get

\[
\frac{a_kt_m}{\rho!}\int_0^\infty e^{-z}z^{\rho+m+1}\,dz = \frac{a_kt_m(\rho+m+1)!}{\rho!}.
\]

Therefore,

\[
\frac{a_ke^k}{\rho!}\int_k^\infty T\,dz
\]

is an integer divisible by $\rho + 1$. The first term $\frac{a}{\rho!}\int_0^\infty T\,dz$ in $P_1$ is easily seen to be
congruent to

\[
\pm a(n!)^{\rho+1} \pmod{\rho+1},
\]

and hence

\[
P_1 \equiv \pm a(n!)^{\rho+1} \pmod{\rho+1}.
\]

Take $\rho + 1$ to be a large prime so that $a(n!)^{\rho+1}$ is not divisible by $\rho + 1$. Notice that
we can obviously choose $a \neq 0$, if e is algebraic. As for $P_2$, we can make it as small as
we like by taking $\rho$ large. But $P_1$, being an integer not divisible by $\rho + 1$, is a nonzero
integer, contradicting (35.23), and hence e cannot
be algebraic. Note that Hermite’s original proof still has the advantage that it obtains
rational approximations of e raised to integer powers.
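
The arithmetic at the heart of Hilbert's argument, namely that each shifted term in $P_1$ is an integer divisible by $\rho + 1$ while the first term is congruent to $\pm a(n!)^{\rho+1}$, can be checked exactly for small parameters. The following Python sketch is an illustration under assumed sample values ($n = 2$, $\rho = 4$, and arbitrary integers $a, a_1, a_2$ that do not come from any actual relation, since none exists); the function names are likewise hypothetical. It uses only the gamma integral $\int_0^\infty e^{-z}z^i\,dz = i!$.

```python
from math import factorial


def poly_mul(p, q):
    """Product of two integer polynomials (coefficient lists, lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, e):
    """p raised to the nonnegative integer power e."""
    out = [1]
    for _ in range(e):
        out = poly_mul(out, p)
    return out


def shifted_term(n, rho, k):
    """(e^k / rho!) * integral from k to infinity of T dz, as an exact integer.

    After z -> z + k the integrand is e^{-z} times the integer polynomial
    (z+k)^rho * prod_{j=1..n} (z+k-j)^(rho+1); each monomial z^i integrates to i!.
    """
    poly = poly_pow([k, 1], rho)
    for j in range(1, n + 1):
        poly = poly_mul(poly, poly_pow([k - j, 1], rho + 1))
    total = sum(c * factorial(i) for i, c in enumerate(poly))
    quotient, remainder = divmod(total, factorial(rho))
    assert remainder == 0  # every monomial has degree at least rho
    return quotient


def p1(coeffs, rho):
    """P_1 = sum over k of a_k * shifted_term(n, rho, k), for coeffs = [a, a_1, ..., a_n]."""
    n = len(coeffs) - 1
    return sum(a_k * shifted_term(n, rho, k) for k, a_k in enumerate(coeffs))


if __name__ == "__main__":
    n, rho = 2, 4                    # rho + 1 = 5 is prime and exceeds n
    print([shifted_term(n, rho, k) % (rho + 1) for k in range(n + 1)])
    print(p1([3, -7, 11], rho) % (rho + 1))
```

Only the $k = 0$ term contributes modulo $\rho + 1$, so $P_1$ is a nonzero integer whenever the prime $\rho + 1$ divides neither $a$ nor $n!$.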


35.5 Exercises 


(1) Show that the sum of Goldbach’s series is given exactly by 


— 1g B\c02 ne 
Deena) 


f=]. 


where $\psi(x)$ is Gauss’s digamma function. Observe that the value of $\psi\!\left(\frac{p}{q}\right)$ may
be explicitly calculated for integer values of p and q, as Goldbach used them.
See the results of Gauss in Chapter 17, Exercise 11. 


(2) Suppose $\lambda$ is a rational number not equal to a negative integer. Let

\[
\phi_\lambda(z) = \sum_{n=0}^{\infty} \frac{z^n}{(\lambda + 1)_n}
\]

and let $\xi$ be a nonzero algebraic number. Show that $\phi_\lambda(\xi)$ is transcendental.
Siegel stated this result without proof in his 1929 paper. The first published
proof is due to Shidlovskii, dating from 1954. See Shidlovskii (1989) p. 185.

(3) Suppose that the E-functions $f_1(z), \ldots, f_m(z)$ are algebraically independent
over $\mathbb{C}(z)$ and form a solution of the system of linear differential equations

\[
y_k' = Q_{k,0} + \sum_{i=1}^{m} Q_{k,i}\,y_i, \quad k = 1, \ldots, m; \qquad Q_{k,i} \in \mathbb{C}(z).
\]

Let $\xi$ be an algebraic number such that $\xi T(\xi) \neq 0$, with $T(\xi)$ as defined
earlier. Show that under these conditions, the numbers $f_1(\xi), \ldots, f_m(\xi)$ are
algebraically independent. See Shidlovskii (1989) p. 123.

(4) Let 


\[
f(z) = \sum_{n=0}^{\infty} z^{2^n}.
\]


Show that if $\alpha$ is algebraic and $0 < |\alpha| < 1$, then $f(\alpha)$ is transcendental.
This result is due to Kurt Mahler (1903-1985); though mostly self-taught, he 
regarded himself as a student of Siegel in his research. For this and other results, 
see the paper by J. H. Loxton and A. J. van der Poorten in Baker and Masser 
(1977) pp. 211-226. 

(5) Show that if 


A@=)o2" and f@ = 02", 
n=0 n=0
then for any two algebraic numbers $\alpha_1, \alpha_2$ in $0 < |z| < 1$, $f_1(\alpha_1)$ and $f_2(\alpha_2)$
are algebraically independent. See Kubota (1977).

(6) Show that for an algebraic number $\beta \neq 0, 1$, the two numbers defined by
the hypergeometric series $F\!\left(\tfrac12, \tfrac12; 1; \beta\right)$ and $F\!\left(-\tfrac12, \tfrac12; 1; \beta\right)$ are algebraically
independent over the rationals. See Chudnovsky and Chudnovsky (1988).

(7) Show that if $\alpha$ is algebraic and $0 < |\alpha| < 1$, the theta series $\sum_{n \geq 0} \alpha^{n^2}$ is
transcendental. Recall that Liouville had shown this number to be irrational
for $\alpha = \frac{1}{l}$ where $l$ was an integer $> 1$. See Nesterenko (2006) in Bolibruch,
Osipov, and Sinai (2006).


35.6 Notes on the Literature 


See Lützen (1990) pp. 511-526 for a very interesting history of Liouville’s work on
transcendental numbers and some early history of such numbers. For the development 
of the theory of transcendental numbers in the 1970s and 1980s, see the articles 
in Baker (1988) and Baker and Masser (1977). Gelfond (1960) also contains some 
historical remarks on transcendental numbers. 

An English translation of a portion of Hermite’s 1873 paper may be found in Smith 
(1959) vol. I, pp. 97-106. Hilbert (1970) vol. I, pp. 1-4, is a reprint of Hilbert’s short 
proofs of the transcendence of e and π. These proofs were also presented by Felix
Klein (1911), as lecture seven of his 1893 Evanston lectures. For the proof of the 
transcendence of e, see Klein (1911) pp. 53-55. 

Yandell (2002) gives an entertaining popular account of Hilbert’s problems and 
those who made contributions to the solutions. See also Browder (1976) to read 
articles by experts on the mathematical developments connected with Hilbert’s 
problems (up to 1975).
Value Distribution Theory


36.1 Preliminary Remarks 


Value distribution theory addresses the problem of measuring the solution set of the 
equation f(z) = b, where f is some analytic function in some domain D and b is any 
complex number. For example, when f is a polynomial of degree n, the fundamental 
theorem of algebra, proved by Gauss and others, states that f(z) = b has n solutions 
for a given b. The converse of this is an easier proposition. Algebraists since Descartes 
and Harriot recognized the important property of polynomials, that if $a_1, a_2, \ldots, a_n$
were a finite sequence of numbers, then there would be a polynomial of degree $n$,
$(x - a_1)(x - a_2)\cdots(x - a_n)$, with zeros at exactly these numbers. After Euler found the
infinite product factorization of the trigonometric and other functions, mathematicians
could raise the more general question of the existence of a function with an infinite
sequence $a_1, a_2, a_3, \ldots$ as its set of zeros. Of course, it was almost immediately
understood that the product $\prod_{n=1}^{\infty}(x - a_n)$ might not converge; in special instances
such as the gamma function, the proper modification was also determined, in order to
ensure convergence. Gauss and Abel treated infinite products with some care in their
work on the gamma and elliptic functions. But the answer to the general question had
to wait for the development of the foundations of the theory of functions of a complex
variable. In fact, Weierstrass, one of the founders of this theory, published an important
1876 paper, “Zur Theorie der eindeutigen analytischen Funktionen,” dealing with
the problem.¹

Karl Weierstrass (1815-1897) studied law at Bonn University, but after four years 
he failed to get a degree. With Christoph Gudermann as his mathematics teacher, 
Weierstrass became a Gymnasium teacher in 1841. Gudermann was a researcher in 
the area of power series representation of elliptic functions, and Weierstrass in turn 
made power series the basic technique in his work in complex analysis. His great 
accomplishment was the construction of a theory of Abelian functions; the 1854
publication of the first installment of his theory secured him a professorship at Berlin.
Weierstrass was a great teacher and had many great students, including H. A. Schwarz, 


1 Weierstrass (1894-1927) vol. 2, pp. 76-101.
G. Cantor, Leo Königsberger, and Sonya Kovalevskaya, whom he held in high regard.
The Mathematics Genealogy Project counts his mathematical descendants as 16,585; 
the author would be included in that number. In order to lay a firm foundation for 
his theories, Weierstrass carefully developed the basic concepts of infinite series and 
infinite products. 

Weierstrass took a sequence $\{a_n\}$ such that $\lim_{n\to\infty}|a_n| = \infty$, so that $\infty$ was the only limit point.
Note that if $c$ is a finite limit point of the sequence, then a nonconstant function with
zeros at $\{a_n\}$ cannot be analytic at $c$. And since a polynomial $f(x)$ is analytic at every
finite $x$, it is reasonable to require that our function be analytic in the complex plane;
Weierstrass called such a function an entire function and specified that $\lim_{n\to\infty}|a_n| = \infty$.

He observed Cauchy’s result that in the case of a finite number of zeros, an
entire function $f$ with zeros at $a_1, a_2, a_3, \ldots, a_n$ would take the form $f(x) = e^{g(x)}
\prod_{k=1}^{n}(x - a_k)$, with $g$ an entire function. Note that it is now standard practice to
denote a complex variable by $z$ or $w$ but Weierstrass used $x$. Weierstrass noted that for
infinite sequences, one might make the product conditionally convergent by arranging
the factors in a particular order, but this was not possible in general. As an example,
he gave the product always divergent for $x \neq 0$:

\[
(1 + x)\left(1 + \frac{x}{2}\right)\left(1 + \frac{x}{3}\right)\cdots.
\]


Now, as discussed in Chapter 17, the reciprocal of the gamma function has zeros at
the negative integers and by Euler’s definition, attributed by Weierstrass to Gauss,

\[
\frac{1}{\Gamma(x)} = x\prod_{n=1}^{\infty}\left(1 + \frac{x}{n}\right)\left(\frac{n}{n+1}\right)^{x}
\]

or

\[
\frac{1}{\Gamma(x)} = x\prod_{n=1}^{\infty}\left(1 + \frac{x}{n}\right)e^{-x\log\frac{n+1}{n}}.
\]

In this context, instead of $\ln z$, we use the notation $\log z$, the logarithm of a complex
number $z$; this is a multivalued function whose principal value is such that for $x > 0$,
$\log x = \ln x$. Next, as we mentioned in Section 17.5, F. W. Newman had explicitly
observed that convergence required the exponential factor $e^{-x/n}$. Although Weierstrass
may not have been familiar with Newman’s paper, he wrote that the product for $\frac{1}{\Gamma(x)}$
directed him toward a method for achieving convergence. He realized that with each
factor $\left(1 - \frac{x}{a_n}\right)$, it was necessary to include an exponential factor

\[
e^{\frac{x}{a_n} + \frac{1}{2}\left(\frac{x}{a_n}\right)^2 + \cdots + \frac{1}{m_n}\left(\frac{x}{a_n}\right)^{m_n}},
\]

where $m_n$ was chosen in such a way that the product converged. For this purpose,
Weierstrass defined the primary factors


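\[
E(x, 0) = 1 - x \quad\text{and}\quad E(x, m) = (1 - x)\,e^{x + \frac{x^2}{2} + \cdots + \frac{x^m}{m}}.
\]

Since

\[
1 - x = e^{\log(1 - x)} = e^{-\sum_{r=1}^{\infty}\frac{x^r}{r}} \quad\text{for } |x| < 1,
\]

he had

\[
E(x, m) = e^{-\sum_{r=1}^{\infty}\frac{x^{m+r}}{m+r}} \quad\text{for } |x| < 1.
\]

Thus, $m_n$ could be chosen so that for any fixed $x$, the series $\sum_{n=1}^{\infty}\left|\frac{x}{a_n}\right|^{m_n+1}$
converged. In fact, note that $m_n = n$ would work, since $\left|\frac{x}{a_n}\right|^{n+1} < \left(\frac12\right)^{n+1}$ as long as
$|a_n| > 2|x|$. Also, because $\lim_{n\to\infty}|a_n| = \infty$, this inequality would be valid for all but a
finite number of $a_n$. Weierstrass proved that

\[
\sum_{n=1}^{\infty}\log E\!\left(\frac{x}{a_n}, m_n\right)
\]

converged absolutely and uniformly in any disk $|x| < R$, and so did the product

\[
\prod_{n=1}^{\infty} E\!\left(\frac{x}{a_n}, m_n\right).
\]

This product was an entire function with zeros at exactly $a_1, a_2, a_3, \ldots$.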
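
The series form of the primary factor, on which the convergence argument rests, can be checked numerically. The Python sketch below is only an illustration; the function names, the truncation at 200 terms, and the sample point $x = 0.3 + 0.2i$ are assumptions. It compares $E(x, m)$ computed from the definition with the exponential of the truncated series $-\sum_{r \geq 1} x^{m+r}/(m+r)$ for $|x| < 1$.

```python
import cmath


def primary_factor(x, m):
    """E(x, m) = (1 - x) * exp(x + x^2/2 + ... + x^m/m), with E(x, 0) = 1 - x."""
    exponent = sum(x ** r / r for r in range(1, m + 1))
    return (1 - x) * cmath.exp(exponent)


def log_primary_factor_series(x, m, terms=200):
    """Truncation of -sum_{r >= 1} x^(m+r)/(m+r), meaningful for |x| < 1."""
    return -sum(x ** (m + r) / (m + r) for r in range(1, terms + 1))


if __name__ == "__main__":
    x, m = 0.3 + 0.2j, 3  # hypothetical sample point inside the unit disk
    direct = primary_factor(x, m)
    from_series = cmath.exp(log_primary_factor_series(x, m))
    print(abs(direct - from_series))
```

Because the series begins with the power $x^{m+1}$, the logarithm of $E(x/a_n, m_n)$ is small when $|x/a_n|$ is small, which is what makes the choice of $m_n$ effective.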

The French mathematician Edmond Laguerre (1834-1886) used Weierstrass’s
product to classify transcendental entire functions according to their genus, just as
polynomials may be classified by their degree. He defined a product to be of genus
$m$ if the integer $m_n$ in each primary factor was a fixed integer $m$. Thus, a product
of genus 0 is of the form $\prod_{n=1}^{\infty}\left(1 - \frac{x}{a_n}\right)$, while a product of genus 1 takes the form
$\prod_{n=1}^{\infty}\left(1 - \frac{x}{a_n}\right)e^{x/a_n}$. As we have seen in a different context, Laguerre was motivated by
a desire to extend to transcendental functions the classical results on polynomials of
Descartes, Newton, and others. Recall that in the course of discovering his extension of
the Descartes rule of signs, Newton showed that if a polynomial with real coefficients
$c_0 + c_1x + c_2x^2 + \cdots + c_nx^n$ had all roots real, then

\[
(r + 1)(n - r + 1)\,c_{r+1}c_{r-1} \leq r(n - r)\,c_r^2, \quad r = 1, 2, \ldots, n - 1;
\]

the inequality would be strict when all roots were not equal. Note that by taking $n$
infinite, we have the inequalities

\[
(r + 1)\,c_{r+1}c_{r-1} \leq r\,c_r^2, \quad r = 1, 2, 3, \ldots.
\]

Laguerre raised the question: Given a transcendental entire function $f(x) =
\sum_{n=0}^{\infty} c_nx^n$ with all real roots and real coefficients, will the coefficients satisfy these
inequalities? In a paper of 1882, he stated that the result was true for $f(x)$ of genus 1.


In the same year, he proved that, if $\frac{f'(x)}{x^n f(x)}$ went to zero as $|x| \to \infty$, then $f(x)$ was of
genus $n$.² Laguerre also investigated the relationship between the zeros of a function
of genus 0 or 1, with real zeros, and the zeros of its derivative.
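
Newton's inequalities for a polynomial with all roots real, recalled above, are simple to test on examples. The Python sketch below is an illustration only; the function names and the sample polynomials are assumptions. It builds integer coefficients $c_0, \ldots, c_n$ from chosen roots and checks $(r+1)(n-r+1)c_{r+1}c_{r-1} \leq r(n-r)c_r^2$ for $r = 1, \ldots, n-1$.

```python
def coeffs_from_roots(roots):
    """Coefficients c_0, ..., c_n (lowest degree first) of prod (x - root)."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                   # x * p(x)
        scaled = [-r * c for c in coeffs] + [0]  # -r * p(x)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs


def newton_inequalities_hold(c):
    """Check (r+1)(n-r+1) c_{r+1} c_{r-1} <= r(n-r) c_r^2 for r = 1, ..., n-1."""
    n = len(c) - 1
    return all(
        (r + 1) * (n - r + 1) * c[r + 1] * c[r - 1] <= r * (n - r) * c[r] ** 2
        for r in range(1, n)
    )


if __name__ == "__main__":
    print(newton_inequalities_hold(coeffs_from_roots([1, 2, -3, 5])))  # all roots real
    print(newton_inequalities_hold([1, 3, 3, 1]))                      # (1 + x)^3, equal roots
    print(newton_inequalities_hold([1, 0, 1]))                         # 1 + x^2, no real roots
```

The inequalities are a necessary condition for real roots, so a failure, as for $1 + x^2$, certifies that some root is not real; passing the test does not by itself guarantee real roots.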

In an 1883 paper³ on entire functions, Poincaré looked for a connection between
the growth of a function and its genus $p$. He proved that for every $\epsilon > 0$,

\[
\lim_{x\to\infty} \frac{|f(x)|}{e^{\epsilon|x|^{p+1}}} = 0,
\]

and if

\[
f(x) = \sum_{n=0}^{\infty} c_nx^n
\]

then

\[
\lim_{n\to\infty} c_n\,\Gamma\!\left(\frac{n + p + 1}{p + 1}\right) = 0.
\]


These results suggested that in order to measure growth of a function, one required a
concept more refined than its genus. Consider the case of a monic polynomial $g(x)$ of
degree $n$. For large $x$, $|g(x)|$ behaves like $|x|^n$. So if $M(r, g)$ is the maximum value of
$|g(x)|$ on $|x| = r$, then

\[
\lim_{r\to\infty} \frac{\log M(r, g)}{\log r} = n.
\]

In 1896,⁴ Émile Borel defined the order $\rho$ of a transcendental entire function $f$:

\[
\rho = \limsup_{r\to\infty} \frac{\log\log M(r, f)}{\log r},
\]

a concept implicitly contained in Hadamard’s 1893 work on the Riemann zeta
function.⁵ Now Riemann had introduced the entire function

\[
\xi(s) = \Gamma\!\left(1 + \frac{s}{2}\right)(s - 1)\,\pi^{-s/2}\zeta(s) \tag{36.1}
\]


and by a brilliantly intuitive argument obtained its product formula. Hadamard’s work 
on entire functions was motivated by the desire to provide justification for some of 
Riemann’s results. 

Jacques Hadamard (1865-1963) studied at the École Normale, where his teachers
included the outstanding mathematicians J. Tannery, Hermite, Picard, P. Appell, and 
G. Goursat. Hadamard wrote his doctoral thesis on the Taylor series of complex 
analytic functions, deriving results on the relation of the coefficients with the location 
of the singularities and with the radius of convergence. In his report on the thesis, 


2 Laguerre (1972) vol. 1, pp. 167-170, pp. 171-173. 
3 Poincaré (1883). 

4 Borel (1896).

5 Hadamard (1893).
Picard wrote that the abstract results appeared to lack practical value; Hermite
suggested that Hadamard look for applications. Fifty years later, Hadamard recalled,⁶
“At that time, I had none [no applications] available. Now, between the time my
manuscript was handed in and the day when the thesis was defended, I became aware
of an important question which had been proposed by the Académie des Sciences
as a prize subject; and precisely the results in my thesis gave the solution of that
question. I had been led solely by my feeling of the interest of the problem and it
led me the right way.” The problem posed by the Académie was to prove Riemann’s
unproved assertions. Hadamard used his result on the relation between the coefficients
and the growth of the function to prove that the exponent of convergence of the zeros
of the function in (36.1) was at most 1. This effectively established Riemann’s product
formula for $\xi(s)$.

Hadamard’s 1893 work implicitly contained the factorization theorem for functions 
of finite order: If $f(z)$ is of order $\rho$, then

$$f(z) = z^m P(z) e^{Q(z)},$$

where $Q(z)$ is a polynomial of degree $q \le \rho$ and $P(z)$ is a product of genus $p \le \rho$. 
Moreover, the order of $P(z)$ is equal to the exponent of convergence $\rho_1$ of the zeros $z_n$ 
of $P$, and $\rho_1 \le \rho$. Note that the exponent of convergence is the infimum of the positive 
numbers $\alpha$ such that $\sum |z_n|^{-\alpha}$ converges. The work of Hadamard and Borel also 
implied a formula connecting the coefficients $c_n$ of the Taylor series expansion of an 
entire function with the order of the function. In 1902, this was explicitly stated by the 
Finnish mathematician Ernst Lindelöf (1870-1946), son of Lorenz Lindelöf and a student 
of Mellin, as the relation

$$\rho = \limsup_{n \to \infty} \frac{-n \log n}{\log |c_n|}.$$
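Lindelöf's relation between the order and the Taylor coefficients, $\rho = \limsup_{n\to\infty} n\log n / \log(1/|c_n|)$, can be explored numerically. The following Python sketch is an illustration added here and is not part of Lindelöf's argument; the function names are hypothetical. It evaluates the quotient at a few indices for $e^z$ (coefficients $1/n!$, order 1) and for $\cos\sqrt{z}$ (coefficients $(-1)^n/(2n)!$, order 1/2), using the log-gamma function to avoid overflow. Because the relation involves a limit superior, values at single indices only suggest the trend.

```python
import math


def order_estimate(log_abs_coeff, n):
    """Return n*log(n) / log(1/|c_n|) for a single index n.

    log_abs_coeff(n) must return log|c_n| (a negative number for large n).
    """
    return n * math.log(n) / (-log_abs_coeff(n))


def log_coeff_exp(n):
    # exp(z) = sum z^n / n!, so log|c_n| = -log(n!) = -lgamma(n + 1).
    return -math.lgamma(n + 1)


def log_coeff_cos_sqrt(n):
    # cos(sqrt(z)) = sum (-1)^n z^n / (2n)!, so log|c_n| = -lgamma(2n + 1).
    return -math.lgamma(2 * n + 1)


for n in (10**2, 10**4, 10**6):
    print(n, order_estimate(log_coeff_exp, n), order_estimate(log_coeff_cos_sqrt, n))
```

The printed quotients decrease slowly toward 1 and 1/2 respectively; the approach is only logarithmic in $n$, so large indices are needed before the limit becomes visible.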


Lindelöf’s interest in entire functions was aroused by his contact with Hadamard 
and others when he stayed in Paris in 1893-94 and then in 1898-99. When he returned 
to the University of Helsingfors in Finland, Lindelöf communicated this interest to 
his students, including Frithiof and Rolf Nevanlinna, who made fundamental contributions 
to the value distribution theory of meromorphic functions. Rolf Nevanlinna 
(1895-1980) founded value distribution theory as a quantitative generalization of 
Picard’s theorem. 

Picard (1856-1941) proved that for an entire function $f(x)$, the equation $f(x) = a$ 
had a solution for every complex number $a$ with at most one exception. The value of 
$e^x$ is never 0, illustrating that exceptions might exist. Picard proved this in 1879 by an 
ingenious application of the multivalued inverse of the elliptic modular function $k^2(\tau)$; 
note that $k^2(\tau)$ had earlier been studied by Abel, Jacobi, Hermite, Schwarz, and others. 
Speaking at his Jubilee celebration of 1936, Hadamard praised Picard’s teaching 
as masterly. Referring to Picard’s theorem, Hadamard addressed his teacher:⁷ “All 


6 Maz’ya and Shaposhnikova (1998) p. 56. 
7 ibid. p. 36. 
mathematicians know, on the other hand, what a marvelous stimulus for research your 
mysterious and disconcerting theorem on entire functions was, and still is, because 
the subject has lost nothing of its topicality. I can say that I owe to it a great part 
of the inspiration of my first years of work.” Indeed, Picard went on to extend his 
result to functions with an essential singularity at infinity. This theorem is a vast 
generalization of the Sokhotskii–Casorati–Weierstrass theorem that every complex 
number is a limit of the values assumed by a function in any neighborhood of an 
essential singularity. 

Now in the case of a polynomial $f(x)$ of degree $n$, for every $a$, the equation 
$f(x) = a$ has $n$ roots, counting multiplicity. Picard’s theorem predicts that for a 
transcendental entire function $f(x)$, the equation $f(x) = a$ has an infinite number 
of solutions with at most one exceptional number $a$. It was then natural to seek a more 
precise measure of the number of solutions, or to inquire about their density. In 1896, 
Borel proved⁸ that, for entire functions of finite order, if $f$ was of nonintegral order $\rho$, 
then the exponent of convergence of the zeros of $f - a$ equaled $\rho$ for all complex 
numbers $a$. If $f$ was of integral order $\rho$, then the same result would hold with at most 
one exceptional value of $a$, in which case the exponent of convergence was less than $\rho$. 

With the development of complex function theory, attempts were made to prove 
Picard’s theorem without using elliptic modular functions. Borel, Landau, Bloch, and 
R. Nevanlinna found such proofs, opening up new paths in function theory and making 
the topic among the most popular in the mathematics of the early twentieth century. 
Nevanlinna’s difficulty in extending the results 
for entire functions to meromorphic functions was the lack of a concept corresponding 
to the maximum modulus of a function. Interestingly, in 1899, Jensen derived the basic 
formula for obtaining this expanded concept.⁹ He proved that if $f$ was meromorphic 
in $|z| \le r$ with zeros at $a_i$ and poles at $b_k$ inside $|z| < r$, then

$$\log|f(0)| = \frac{1}{2\pi}\int_0^{2\pi} \log|f(re^{i\theta})|\,d\theta - \sum \log\frac{r}{|a_i|} + \sum \log\frac{r}{|b_k|},$$

where sums were taken over all the zeros and all the poles respectively. Jensen thought 
that the formula might be important in studying the zeros of the Riemann zeta function 
and in particular in the proof of the Riemann hypothesis. In fact, Jensen’s formula 
was useful in simplifying proofs of results in both prime number theory and entire 
functions. 

In 1925, R. Nevanlinna defined the analog of the maximum modulus, the characteristic 
function of a meromorphic function $f$, as the sum of two functions:¹⁰ 
the mean proximity function, measuring the average closeness of $f$ to a given 
complex number $a$; and the counting function, measuring the frequency with which 
$f$ assumed the value $a$. In the same year, he went on to prove two fundamental 
theorems on the characteristic function, and his brother F. Nevanlinna recast them in a 


8 Borel (1896). 
9 Jensen (1899). 
10 Nevanlinna (1925). 
geometric context. The latter approach was further developed and extended in 1929 by 
T. Shimizu; by Lars Ahlfors in papers of the 1930s; and in 1960 by S. S. Chern. In the 
1980s, Charles Osgood and Paul Vojta observed a close analogy between Nevanlinna 
theory and Diophantine approximation. The clarification and precise delineation of 
this analogy has had important consequences for both topics. 

It is somewhat surprising that Jacobi had already found Jensen’s result in 1827. 
Jacobi stated it only for polynomials, but as Landau pointed out, his argument can 
be extended to the general case. Jacobi used Fourier series to obtain his formula and 
was inspired by the work of Marc-Antoine Parseval (1755-1836) in Lagrange series. 
Parseval derived formulas for roots of equations in terms of definite integrals. Jacobi 
carried this program further by finding integral expressions for sums of powers of 
any number of roots of an equation, given in increasing order. Incidentally, Jacobi 
mentioned in his paper that as early as 1777, Euler had discovered the “Fourier 
coefficients;” Riemann was apparently not aware of this fact when he wrote his 1853 
thesis on trigonometric series. 


36.2 Jacobi on Jensen’s Formula 


Jacobi took a polynomial $f(x) = a + bx + cx^2 + \cdots + x^p$ with real coefficients and set¹¹

$$\log(U^2 + V^2) = \varphi\left(re^{x\sqrt{-1}}\right) + \varphi\left(re^{-x\sqrt{-1}}\right),$$

where $\varphi(x) = \log f(x)$. He denoted the zeros of $f$ by $\alpha', \alpha'', \alpha''', \ldots, \alpha^{(p)}$. These 
were taken in increasing order of absolute values, and Jacobi considered three separate 
cases: (i) $r$ greater than all the roots, (ii) $r$ less than all the roots, and (iii) $r$ between 
$\alpha^{(k)}$ and $\alpha^{(k+1)}$. His formula is stated as

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \log(U^2 + V^2)\,dx = \begin{cases} p\log r^2 & \text{in the first case,}\\ \log a^2 & \text{in the second case,}\\ k\log r^2 + \log(\alpha^{(k+1)})^2 + \log(\alpha^{(k+2)})^2 + \cdots + \log(\alpha^{(p)})^2 & \text{in the third case.} \end{cases} \tag{36.2}$$

Summarizing Jacobi’s argument, suppose $\varphi(x) = \log(a + bx + cx^2 + \cdots + x^p)$, 
where the coefficients $a, b, c, \ldots$ of the $p$th degree polynomial are real. Let 
$(x - \alpha')(x - \alpha'')(x - \alpha''') \cdots (x - \alpha^{(p)})$ represent the factorization of the polynomial. 
Since the polynomial has real coefficients, the complex roots appear in conjugate pairs. 
It is clear that

$$\begin{aligned} \varphi\left(re^{x\sqrt{-1}}\right) + \varphi\left(re^{-x\sqrt{-1}}\right) = \log\{ & (a + br\cos x + cr^2\cos 2x + \cdots + r^p\cos px)^2 \\ & + (br\sin x + cr^2\sin 2x + \cdots + r^p\sin px)^2\}. \end{aligned}$$


11 Jacobi (1969) vol. 6, pp. 12-20. 




Denote the expression inside the chain brackets by $U^2 + V^2$ so that

$$\begin{aligned} U^2 + V^2 = {} & a^2 + b^2r^2 + c^2r^4 + \cdots + r^{2p} \\ & + 2r\cos x\,(ab + bcr^2 + cdr^4 + \cdots) \\ & + 2r^2\cos 2x\,(ac + bdr^2 + cer^4 + \cdots) \\ & + 2r^3\cos 3x\,(ad + ber^2 + cfr^4 + \cdots) + \cdots. \end{aligned} \tag{36.3}$$
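Now if $f(x) = a + bx + cx^2 + dx^3 + \cdots$, then the values of the Fourier integrals 
for $f(re^{x\sqrt{-1}})$ are given by

$$\begin{aligned} a^2 + b^2r^2 + c^2r^4 + d^2r^6 + \cdots &= \frac{1}{2\pi}\int_{-\pi}^{\pi} f\left(re^{x\sqrt{-1}}\right) f\left(re^{-x\sqrt{-1}}\right) dx, \\ ab + bcr^2 + cdr^4 + der^6 + \cdots &= \frac{1}{2\pi r}\int_{-\pi}^{\pi} f\left(re^{x\sqrt{-1}}\right) f\left(re^{-x\sqrt{-1}}\right) \cos x\,dx, \\ ac + bdr^2 + cer^4 + dfr^6 + \cdots &= \frac{1}{2\pi r^2}\int_{-\pi}^{\pi} f\left(re^{x\sqrt{-1}}\right) f\left(re^{-x\sqrt{-1}}\right) \cos 2x\,dx, \\ & \;\;\vdots \end{aligned} \tag{36.4}$$

Note that $U^2 + V^2 = f(re^{x\sqrt{-1}})\,f(re^{-x\sqrt{-1}})$ and (36.3) gives the Fourier 
expansion of this function so that the Fourier coefficients can be computed by (36.4). 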


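We also have

$$\begin{aligned} \log\left(re^{x\sqrt{-1}} - \alpha'\right) &= \log re^{x\sqrt{-1}} + \log\left(1 - \frac{\alpha'}{r}e^{-x\sqrt{-1}}\right) && \text{when } r > |\alpha'|, \\ &= \log(-\alpha') + \log\left(1 - \frac{r}{\alpha'}e^{x\sqrt{-1}}\right) && \text{when } |\alpha'| > r. \end{aligned}$$

The logarithms on the right-hand sides can be expanded as a series by

$$-\log(1 - t) = t + \frac{1}{2}t^2 + \frac{1}{3}t^3 + \frac{1}{4}t^4 + \cdots, \qquad |t| < 1.$$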


Jacobi used the above facts to give the series expansions for $\log(U^2 + V^2)$:

1. With $r$ greater than the absolute values of all the roots:

$$p\log r^2 - 2\sum\left(\frac{\alpha}{r}\cos x + \frac{\alpha^2}{2r^2}\cos 2x + \frac{\alpha^3}{3r^3}\cos 3x + \frac{\alpha^4}{4r^4}\cos 4x + \cdots\right).$$

2. With $r$ smaller than the absolute values of all the roots:

$$\log a^2 - 2\sum\left(\frac{r}{\alpha}\cos x + \frac{r^2}{2\alpha^2}\cos 2x + \frac{r^3}{3\alpha^3}\cos 3x + \frac{r^4}{4\alpha^4}\cos 4x + \cdots\right),$$

where $a$ is the constant term in the polynomial $f(x)$. 
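3. When $|\alpha^{(k)}| < r < |\alpha^{(k+1)}|$:

$$\begin{aligned} & k\log r^2 - 2\sum_{1}^{k}\left(\frac{\alpha}{r}\cos x + \frac{\alpha^2}{2r^2}\cos 2x + \frac{\alpha^3}{3r^3}\cos 3x + \frac{\alpha^4}{4r^4}\cos 4x + \cdots\right) \\ & \quad + \sum_{k+1}^{p}\log\alpha^2 - 2\sum_{k+1}^{p}\left(\frac{r}{\alpha}\cos x + \frac{r^2}{2\alpha^2}\cos 2x + \frac{r^3}{3\alpha^3}\cos 3x + \frac{r^4}{4\alpha^4}\cos 4x + \cdots\right), \end{aligned}$$

where, as Jacobi explained, $\sum_{m}^{n}\psi(\alpha)$ denoted the sum of the quantities

$$\psi(\alpha^{(m)}),\ \psi(\alpha^{(m+1)}),\ \ldots,\ \psi(\alpha^{(n)}).$$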


Jacobi integrated the series for $\log(U^2 + V^2)$ over the interval $(-\pi, \pi)$ to 
obtain

$$\int_{-\pi}^{\pi} \log(U^2 + V^2)\,dx$$

in the three cases. All the cosine terms vanished, and he obtained the formula 
(36.2). Jacobi also found formulas for $\sum_{1}^{k}\alpha^n$ and $\sum_{k+1}^{p}\alpha^{-n}$ in terms of the $n$th 
Fourier coefficients of $\log(U^2 + V^2)$ and $\arctan\frac{V}{U}$. 
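As a check on Jacobi's formula (36.2), consider the simplest case, which is not worked out in the text: $p = 1$ and $f(x) = x - \alpha$ with $\alpha$ real and nonzero, so that the constant term is $a = -\alpha$. The choice of this example is ours; it reduces (36.2) to a classical definite integral.

```latex
\[
U^2 + V^2 = (r\cos x - \alpha)^2 + (r\sin x)^2 = r^2 - 2\alpha r\cos x + \alpha^2 ,
\]
\[
\frac{1}{2\pi}\int_{-\pi}^{\pi} \log\left(r^2 - 2\alpha r\cos x + \alpha^2\right) dx
=
\begin{cases}
\log r^2, & r > |\alpha| \quad (\text{first case with } p = 1),\\[2pt]
\log \alpha^2, & r < |\alpha| \quad (\text{second case, since } a^2 = \alpha^2).
\end{cases}
\]
```

In both cases the value is $\log\max(r^2, \alpha^2)$, which is exactly what the first and second cases of Jacobi's formula predict for a single root.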


36.3 Jensen’s Proof 


Jensen’s 1899 rediscovery of Jacobi’s formula succeeded in connecting the modulus 
of an analytic function with its zeros, and this occurred at just the right moment to fill 
a need in the theory of analytic functions.¹² Jensen himself mentioned the possibility 
of applying his result to a proof of the Riemann hypothesis. Though he apparently did 
not pursue this topic further, and though his work has not as yet made a significant 
dent in the Riemann hypothesis, his formula is fundamental for the theory of entire 
functions. His proof, similar in some respects to Jacobi’s, started with the formula

$$\begin{aligned} \log\left|1 - \frac{z}{a}\right| &= -\sum_{\nu=1}^{\infty}\frac{r^\nu}{2\nu}\left(\frac{e^{\nu\theta i}}{a^\nu} + \frac{e^{-\nu\theta i}}{\bar{a}^\nu}\right), && r = |z| < |a|, \\ &= \log\frac{r}{|a|} - \sum_{\nu=1}^{\infty}\frac{1}{2\nu r^\nu}\left(a^\nu e^{-\nu\theta i} + \bar{a}^\nu e^{\nu\theta i}\right), && r = |z| > |a|. \end{aligned}$$
By integrating, he obtained

$$\frac{1}{2\pi}\int_0^{2\pi}\log\left|1 - \frac{re^{\theta i}}{a}\right| d\theta = \begin{cases} \log\dfrac{r}{|a|}, & \text{for } r > |a|, \\ 0, & \text{for } r < |a|. \end{cases} \tag{36.5}$$

Next he supposed $f(z)$ to be meromorphic in $|z| \le r$ with zeros at $a_1, a_2, a_3, \ldots, a_n$ 
and poles at $b_1, b_2, \ldots, b_m$ in $|z| < r$ and with no singularities on $|z| = r$. Then he 
could express $f(z)$ in the form 


12 Jensen (1899). 
$$f(z) = f(0)\,\frac{\prod_{k=1}^{n}\left(1 - \frac{z}{a_k}\right)}{\prod_{k=1}^{m}\left(1 - \frac{z}{b_k}\right)}\,e^{f_1(z)}, \tag{36.6}$$

where

$$f_1(z) = \sum_{\nu=1}^{\infty} B_\nu z^\nu \quad \text{for } |z| \le r. \tag{36.7}$$


He took the real part of the logarithm of each side of (36.6), integrated over $(0, 2\pi)$, 
and applied (36.5) to get

$$\frac{1}{2\pi}\int_0^{2\pi}\log|f(re^{i\theta})|\,d\theta = \log|f(0)| + \log\left(r^{n-m}\,\frac{|b_1||b_2|\cdots|b_m|}{|a_1||a_2|\cdots|a_n|}\right).$$

Note that the constant term in $f_1$ is zero, and hence there is no contribution from 
the integral of $f_1$. 
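Jensen's formula lends itself to a quick numerical check. The Python sketch below is an illustration added here, with a hypothetical test function and hypothetical helper names; it is not Jensen's own computation. It averages $\log|f|$ over the circle $|z| = r$ with equally spaced sample points and compares the result with $\log|f(0)| + \sum\log(r/|a_k|) - \sum\log(r/|b_k|)$, where the sums run over the zeros and poles inside the circle.

```python
import cmath
import math


def circle_mean_log_abs(f, r, samples=4096):
    """Approximate (1/2pi) * integral of log|f(r e^{i theta})| d theta
    over [0, 2pi] with equally spaced sample points."""
    total = 0.0
    for k in range(samples):
        z = r * cmath.exp(2j * math.pi * k / samples)
        total += math.log(abs(f(z)))
    return total / samples


def jensen_prediction(f, r, zeros, poles):
    """log|f(0)| + sum of log(r/|a|) over zeros in |z| < r
    - sum of log(r/|b|) over poles in |z| < r."""
    value = math.log(abs(f(0)))
    value += sum(math.log(r / abs(a)) for a in zeros if abs(a) < r)
    value -= sum(math.log(r / abs(b)) for b in poles if abs(b) < r)
    return value


# Hypothetical test function: zeros at 0.5, -0.3i, 2 and a pole at 0.8.
ZEROS = [0.5, -0.3j, 2.0]
POLES = [0.8]


def example_f(z):
    value = 1.0 + 0.0j
    for a in ZEROS:
        value *= (z - a)
    for b in POLES:
        value /= (z - b)
    return value


mean_value = circle_mean_log_abs(example_f, 1.0)
predicted = jensen_prediction(example_f, 1.0, ZEROS, POLES)
print(mean_value, predicted, math.log(2.0))
```

For this example both sides equal $\log 2$: on the unit circle the factors belonging to the points 0.5, $-0.3i$, and 0.8 have mean zero, and only the zero at 2, which lies outside the circle, contributes.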


36.4 Backlund Proof of Jensen’s Formula 


The Finnish mathematician R. J. Backlund, a student of Ernst Lindelöf, is credited 
with using a conformal mapping to prove Jensen’s theorem in 1916 or 1918.¹³ This 
proof first assumed $g(z)$ to be analytic without zeros in $|z| < R$ and used Cauchy’s 
integral formula to compute $\log g(0)$ as an integral. Then for a function $f$ with zeros 
at $a_1, a_2, \ldots, a_n$ in $|z| < R$, consider a new function with no zeros in $|z| < R$:

$$g(z) = f(z)\,\frac{R^2 - \bar{a}_1 z}{R(z - a_1)}\cdot\frac{R^2 - \bar{a}_2 z}{R(z - a_2)}\cdots\frac{R^2 - \bar{a}_n z}{R(z - a_n)}. \tag{36.8}$$

Since

$$\left|\frac{R^2 - \bar{a}_k z}{R(z - a_k)}\right| = 1 \quad \text{for } |z| = R,$$

we have

$$|g(Re^{i\theta})| = |f(Re^{i\theta})|.$$


To fill out the details, start with Cauchy’s integral formula

$$\log g(0) = \frac{1}{2\pi i}\int_{|w|=R}\log g(w)\,\frac{dw}{w} = \frac{1}{2\pi}\int_0^{2\pi}\log g(Re^{i\theta})\,d\theta. \tag{36.9}$$


13 Backlund (1918). 
Now apply the expression for $g$ in (36.8) and take the real part to find that

$$\log\frac{|f(0)|\,R^n}{|a_1||a_2|\cdots|a_n|} = \frac{1}{2\pi}\int_0^{2\pi}\log|f(Re^{i\theta})|\,d\theta.$$

This proof considered only analytic functions, but if one takes the case where $f(z)$ 
has poles at $b_1, b_2, \ldots, b_m$, one need merely multiply the right-hand side of (36.8) by

$$\frac{R(z - b_1)}{R^2 - \bar{b}_1 z}\cdot\frac{R(z - b_2)}{R^2 - \bar{b}_2 z}\cdots\frac{R(z - b_m)}{R^2 - \bar{b}_m z}$$

to obtain the result. Although it is difficult to ascertain exactly where Backlund gave 
this proof, we note that the use of the conformal mapping

$$\frac{R^2 - \bar{a}z}{R(z - a)}$$

is a beautiful and efficient innovation because it has a simple pole at $z = a$, which 
cancels the zero of $f$ there, and its absolute value on $|z| = R$ is 1. 


36.5 R. Nevanlinna’s Proof of the Poisson–Jensen Formula 


Rolf Nevanlinna gave an important extension of Jensen’s formula; this result became 
the foundation of his theory of meromorphic functions.¹⁴ Suppose $f(x)$ is a meromorphic 
function in $|x| \le \rho$ ($0 < \rho < \infty$) with zeros and poles at $a_h$ ($h = 1, \ldots, \mu$) and 
$b_k$ ($k = 1, \ldots, \nu$), respectively. Let $x = re^{i\varphi}$, $f(x) \ne 0, \infty$, and $r < \rho$. Then

$$\begin{aligned} \log|f(re^{i\varphi})| = {} & \frac{1}{2\pi}\int_0^{2\pi}\log|f(\rho e^{i\theta})|\,\frac{\rho^2 - r^2}{\rho^2 + r^2 - 2\rho r\cos(\theta - \varphi)}\,d\theta \\ & - \sum_{h=1}^{\mu}\log\left|\frac{\rho^2 - \bar{a}_h x}{\rho(x - a_h)}\right| + \sum_{k=1}^{\nu}\log\left|\frac{\rho^2 - \bar{b}_k x}{\rho(x - b_k)}\right|. \end{aligned} \tag{36.10}$$

Nevanlinna called this the Poisson–Jensen formula, and his proof employed 
Green’s formula

$$\int_\Gamma\left(U\frac{\partial V}{\partial n} - V\frac{\partial U}{\partial n}\right) ds = -\int_G (U\Delta V - V\Delta U)\,d\sigma. \tag{36.11}$$

Here $U$ and $V$ are twice continuously differentiable functions in a connected 
domain $G$ with boundary $\Gamma$ formed by a finite number of analytic arcs. The symbol $\Delta$ 
denotes the Laplacian and $\partial/\partial n$ represents the derivative in the direction normal to the 
boundary but pointing to the interior of $G$. 


14 Nevanlinna (1974). 
Take $U$ to be a real-valued function $u(z)$ harmonic in $G \cup \Gamma$ except for logarithmic 
singularities at $z = c_1, c_2, \ldots, c_p$, so that

$$u(z) = \lambda_k\log|z - c_k| + u_k(z),$$

where $\lambda_k$ is real, and $u_k$ is continuous at $z = c_k$. Now let $V = g(z, x)$, where $g$ 
denotes Green’s function for the domain $G$ with the singularity at an interior point 
$z = x$. This function is completely defined by the two conditions: The sum $g(z, x) + \log|z - x|$ 
is harmonic at all points interior to the domain $G$; also, $g(z, x)$ vanishes on 
the boundary $\Gamma$. Green’s formula (36.11) can be applied to $U$ and $V$ as chosen earlier 
if the points $c_k$ and $x$ are excluded by means of small circles around these points. 
Then, when the radii of these circles are allowed to tend to zero after the application 
of Green’s formula, we get

$$u(x) = \frac{1}{2\pi}\int_\Gamma u(z)\,\frac{\partial g(z, x)}{\partial n}\,ds - \sum_{k=1}^{p}\lambda_k\,g(c_k, x). \tag{36.12}$$

Nevanlinna then took a meromorphic function $f$ in the domain $G$ with zeros at 
$a_h$ ($h = 1, \ldots, \mu$) and poles at $b_k$ ($k = 1, \ldots, \nu$). Then the function $u(z) = \log|f(z)|$ 
satisfied the conditions for (36.12) to hold so that he had

$$\log|f(x)| = \frac{1}{2\pi}\int_\Gamma\log|f(z)|\,\frac{\partial g(z, x)}{\partial n}\,ds - \sum_{h=1}^{\mu} g(a_h, x) + \sum_{k=1}^{\nu} g(b_k, x). \tag{36.13}$$


Nevanlinna noted that this important formula permitted him to compute the 
modulus, $|f|$, at any point inside $G$ by using its values on the boundary of $G$ and 
the location of the poles and zeros of $f$ inside $G$. He took $G$ to be a circle of radius $\rho$ 
about the origin so that Green’s function would be given by

$$g(z, x) = \log\left|\frac{\rho^2 - \bar{x}z}{\rho(x - z)}\right|.$$

By substituting this $g$, with $z = \rho e^{i\theta}$ and $x = re^{i\varphi}$, in (36.13), Nevanlinna 
obtained his Poisson–Jensen formula (36.10). 
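The Poisson–Jensen formula (36.10) can be tested numerically in the same spirit as Jensen's formula. The following Python sketch is an added illustration under stated assumptions: the rational test function, the evaluation point, and all helper names are hypothetical, and the integral is approximated with equally spaced sample points on the circle $|z| = \rho$.

```python
import cmath
import math


def poisson_jensen_rhs(f, rho, x, zeros, poles, samples=4096):
    """Right-hand side of (36.10) at the interior point x, for a function f
    whose zeros and poles in |z| < rho are listed explicitly."""
    r, phi = abs(x), cmath.phase(x)

    # Poisson integral of log|f| over the circle |z| = rho.
    integral = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        kernel = (rho**2 - r**2) / (
            rho**2 + r**2 - 2.0 * rho * r * math.cos(theta - phi)
        )
        integral += math.log(abs(f(rho * cmath.exp(1j * theta)))) * kernel
    integral /= samples

    def green_term(c):
        # log |(rho^2 - conj(c) x) / (rho (x - c))|, Green's function of the disk.
        c = complex(c)
        return math.log(abs((rho**2 - c.conjugate() * x) / (rho * (x - c))))

    return (
        integral
        - sum(green_term(a) for a in zeros)
        + sum(green_term(b) for b in poles)
    )


def rational_example(z):
    # Hypothetical test function with a zero at 0.5 and a pole at -0.4.
    return (z - 0.5) / (z + 0.4)


x0 = 0.3 * cmath.exp(0.7j)
lhs = math.log(abs(rational_example(x0)))
rhs = poisson_jensen_rhs(rational_example, 1.0, x0, zeros=[0.5], poles=[-0.4])
print(lhs, rhs)
```

Setting `x0 = 0` in this sketch makes the Poisson kernel identically 1 and turns each Green's function term into $\log(\rho/|c|)$, which is the passage from (36.10) to Jensen's formula.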

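Furthermore, to obtain Jensen’s formula, he took $x = 0$ in (36.10), assuming that 
$f(0) \ne 0$ or $\infty$. Note that a slight modification was necessary in the cases $f(0) = 0$ 
or $\infty$. Thus, if

$$f(x) = c_\lambda x^\lambda + c_{\lambda+1}x^{\lambda+1} + \cdots,$$

and if $c_\lambda \ne 0$, then $f$ had to be replaced by $x^{-\lambda}f$. 
He therefore had

$$\log|f(0)| = \frac{1}{2\pi}\int_0^{2\pi}\log|f(\rho e^{i\theta})|\,d\theta - \sum_{h=1}^{\mu}\log\frac{\rho}{|a_h|} + \sum_{k=1}^{\nu}\log\frac{\rho}{|b_k|}. \tag{36.14}$$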
In his 1964 book on meromorphic functions, W. K. Hayman noted that the idea 
in the Backlund proof could be extended to yield a simple derivation of the Poisson–Jensen 
formula.¹⁵ Let $g$ be an analytic function without zeros or poles in $|z| < R$. 
Now note that the mapping

$$w = \frac{R(\zeta - z)}{R^2 - \bar{z}\zeta}$$

maps the disk $|\zeta| < R$ conformally onto the unit disk and takes the point $\zeta = z$ to 
$w = 0$. This gave Hayman

$$\frac{dw}{w} = \frac{d\zeta}{\zeta - z} + \frac{\bar{z}\,d\zeta}{R^2 - \bar{z}\zeta} = \frac{(R^2 - |z|^2)\,d\zeta}{(R^2 - \bar{z}\zeta)(\zeta - z)},$$

so that the result of Cauchy’s theorem,

$$\log g(0) = \frac{1}{2\pi i}\int_{|w|=R}\log g(w)\,\frac{dw}{w},$$

could be replaced by

$$\log g(z) = \frac{1}{2\pi i}\int_{|\zeta|=R}\log g(\zeta)\,\frac{(R^2 - |z|^2)\,d\zeta}{(R^2 - \bar{z}\zeta)(\zeta - z)}.$$

Taking real parts of this formula and setting $z = re^{i\theta}$ and $\zeta = Re^{i\phi}$,

$$\log|g(re^{i\theta})| = \frac{1}{2\pi}\int_0^{2\pi}\log|g(Re^{i\phi})|\,\frac{(R^2 - r^2)\,d\phi}{R^2 - 2Rr\cos(\theta - \phi) + r^2}.$$

Following the Backlund approach, Hayman took

$$g(z) = f(z)\prod_{\mu}\frac{R^2 - \bar{a}_\mu z}{R(z - a_\mu)}\prod_{\nu}\frac{R(z - b_\nu)}{R^2 - \bar{b}_\nu z},$$

and the Poisson–Jensen formula followed. 


36.6 Nevanlinna’s First Fundamental Theorem 


In his 1913 thesis,¹⁶ Georges Valiron (1884-1955), student of Borel and teacher of 
Laurent Schwartz, expressed the sums on the right-hand side of (36.14) as integrals 
by means of the counting functions $n(r, 0)$ and $n(r, \infty)$. These functions denote the 
number of zeros and poles, counting multiplicity, of $f(x)$ in $|x| \le r$. To efficiently 
implement Valiron’s idea,¹⁷ Nevanlinna applied the Stieltjes integral to get 


15 Hayman (1964) chapter 1. 
16 Valiron (1913). 
17 Nevanlinna (1925). 
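$$\sum_{h}\log\frac{\rho}{|a_h|} = \int_0^\rho\log\frac{\rho}{r}\,dn(r, 0) = \int_0^\rho\frac{n(r, 0)}{r}\,dr \quad\text{and}\quad \sum_{k}\log\frac{\rho}{|b_k|} = \int_0^\rho\frac{n(r, \infty)}{r}\,dr.$$

He could then express (36.14) in the form

$$\log|f(0)| = \frac{1}{2\pi}\int_0^{2\pi}\log|f(\rho e^{i\theta})|\,d\theta - \int_0^\rho\frac{n(r, 0)}{r}\,dr + \int_0^\rho\frac{n(r, \infty)}{r}\,dr. \tag{36.15}$$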


Nevanlinna went on to write this in symmetric form, where he had the large values 
of the function on one side and the small values on the other. For that purpose, he set

$$\log^+\alpha = \begin{cases} \log\alpha, & \alpha \ge 1, \\ 0, & 0 \le \alpha < 1, \end{cases}$$

so that for $x > 0$, $\log x = \log^+ x - \log^+\frac{1}{x}$. He then defined the mean proximity 
function

$$m(\rho, f) = \frac{1}{2\pi}\int_0^{2\pi}\log^+|f(\rho e^{i\theta})|\,d\theta$$

and the function

$$N(\rho, f) = \int_0^\rho\frac{n(r, \infty)}{r}\,dr.$$

Thus, he was able to rewrite (36.15) as

$$\log|f(0)| = m(\rho, f) - m\left(\rho, \frac{1}{f}\right) + N(\rho, f) - N\left(\rho, \frac{1}{f}\right),$$

or

$$T(\rho, f) = T\left(\rho, \frac{1}{f}\right) + \log|f(0)|, \tag{36.16}$$

where

$$T(\rho, f) = m(\rho, f) + N(\rho, f).$$


Note that the term $m(\rho, f)$ is an average of $\log|f|$ on $|z| = \rho$ for large values of 
$|f|$, while the term $N(\rho, f)$ deals with the poles. So $T(\rho, f)$ acts as a measure of the 
large values of $|f|$ in $|z| \le \rho$, while $T(\rho, \frac{1}{f})$ does the same for the small values of 
$|f|$. The function $T(\rho, f)$ has been named the Nevanlinna characteristic function, and 
it plays a fundamental role in the theory of meromorphic functions. 
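To make the quantities $m$, $N$, and $T$ concrete, the following Python sketch (an added illustration; the helper names and the two test functions are hypothetical choices) computes the proximity function by sampling the circle and the counting function from an explicit list of poles or zeros away from the origin. It then checks two facts: for $f(z) = e^z$, which has no poles, $T(r, f) = m(r, f) = r/\pi$; and for a rational function, relation (36.16), $T(\rho, f) = T(\rho, 1/f) + \log|f(0)|$.

```python
import cmath
import math


def log_plus(t):
    """log^+ t = log t for t > 1 and 0 otherwise."""
    return math.log(t) if t > 1.0 else 0.0


def proximity(f, r, samples=20000):
    """m(r, f): mean of log^+|f| over the circle |z| = r (equally spaced samples)."""
    total = 0.0
    for k in range(samples):
        z = r * cmath.exp(2j * math.pi * k / samples)
        total += log_plus(abs(f(z)))
    return total / samples


def counting(r, points):
    """Integral of n(t)/t from 0 to r when the counted points avoid the origin:
    it equals the sum of log(r/|c|) over the points with |c| < r."""
    return sum(math.log(r / abs(c)) for c in points if 0 < abs(c) < r)


# Example 1: f = exp has no poles, so T(r, f) = m(r, f), which equals r/pi.
m_exp = proximity(cmath.exp, 2.0)

# Example 2 (hypothetical test function): zero at 0.5, pole at 2, radius 3.
def rational_f(z):
    return (z - 0.5) / (z - 2.0)

radius = 3.0
T_f = proximity(rational_f, radius) + counting(radius, [2.0])
T_inv = proximity(lambda z: 1.0 / rational_f(z), radius) + counting(radius, [0.5])
defect = T_f - T_inv - math.log(abs(rational_f(0)))
print(m_exp, 2.0 / math.pi, defect)
```

The last printed number should be zero up to rounding, reflecting the fact that (36.16) is Jensen's formula rewritten with $\log = \log^+ - \log^+(1/\cdot)$.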
In the preceding formulation, we considered small and large values of $f$. More 
generally, Nevanlinna considered values of $f$ close and/or equal to any fixed number 
$a$ by defining

$$N(r, a) = N\left(r, \frac{1}{f - a}\right) = \int_0^r\frac{n(t, a)}{t}\,dt, \tag{36.17}$$

$$m(r, a) = m\left(r, \frac{1}{f - a}\right) = \frac{1}{2\pi}\int_0^{2\pi}\log^+\left|\frac{1}{f(re^{i\theta}) - a}\right| d\theta, \tag{36.18}$$

$$T\left(r, \frac{1}{f - a}\right) = m(r, a) + N(r, a). \tag{36.19}$$

By a simple argument, he showed that

$$|T(r, f) - T(r, f - a)| \le \log^+|a| + \log 2.$$

Combining this inequality with Jensen’s formula (36.16), Nevanlinna arrived at his 
first fundamental theorem:

$$T\left(r, \frac{1}{f - a}\right) = T(r, f) - \log|f(0) - a| + \varepsilon(r, a), \tag{36.20}$$

$$|\varepsilon(r, a)| \le \log^+|a| + \log 2.$$


Additionally, he proved that $N(r, a)$ and $T\left(r, \frac{1}{f - a}\right)$ were increasing convex 
functions of $\log r$. The result for $N(r, a)$ followed from (36.17), since

$$\frac{d\,N(r, a)}{d\log r} = n(r, a). \tag{36.21}$$

Nevanlinna’s proof for the convexity of $T\left(r, \frac{1}{f - a}\right)$ was rather lengthy, but in 
1930 Henri Cartan obtained a simpler proof by first showing that

$$T(r, f) = \frac{1}{2\pi}\int_0^{2\pi} N(r, e^{i\theta})\,d\theta + \log^+|f(0)|. \tag{36.22}$$

Since this immediately implied that

$$\frac{d\,T(r, f)}{d\log r} = \frac{1}{2\pi}\int_0^{2\pi} n(r, e^{i\theta})\,d\theta,$$

the theorem was proved. 

Note that the characteristic function T(r, f) was Nevanlinna’s analog of the 
maximum modulus log M(r, f), long sought after by complex function theorists. 
Recall that the logarithm of the maximum modulus, log M(r, f), was one of the 
essential objects in the study of entire functions; it was investigated by Hadamard, 
Borel, E. Lindelöf, and others. The efforts to extend the theory of entire functions 
to meromorphic functions required a suitable analog of log M(r, f) and Nevanlinna 
provided just that. Incidentally, in 1896 Hadamard proved that log M(r, f) was a 
convex function of log r; this is usually known as the Hadamard three circles theorem. 


36.7 Nevanlinna’s Factorization of a Meromorphic Function 


By use of the Poisson–Jensen formula, Nevanlinna was able to present a simple form 
of a canonical factorization of meromorphic functions of finite order. He gave this 
proof in the third chapter of his book, Le théorème de Picard–Borel,¹⁸ and we outline 
his argument. Note that a function meromorphic in the complex plane is said to be of 
order $\rho$ if $\rho = \limsup_{r\to\infty}\frac{\log T(r)}{\log r}$.

Nevanlinna stated the theorem: Suppose $f(x)$ is a meromorphic function of finite 
order with zeros and poles at $a_1, a_2, \ldots$ and $b_1, b_2, \ldots$, respectively. Let $q$ be an integer 
such that

$$\lim_{r\to\infty}\frac{T(r)}{r^{q+1}} = 0.$$

Then

$$f(x) = x^\alpha\,e^{\sum_{\nu=0}^{q} c_\nu x^\nu}\,\lim_{\rho\to\infty}\frac{\prod_{|a_\mu|<\rho}\left(1 - \frac{x}{a_\mu}\right)e^{\frac{x}{a_\mu} + \cdots + \frac{1}{q}\left(\frac{x}{a_\mu}\right)^q}}{\prod_{|b_\nu|<\rho}\left(1 - \frac{x}{b_\nu}\right)e^{\frac{x}{b_\nu} + \cdots + \frac{1}{q}\left(\frac{x}{b_\nu}\right)^q}},$$

where $\alpha$ is an integer. 
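To prove this, assuming $f(0) \ne 0$, Nevanlinna differentiated the Poisson–Jensen 
formula $q + 1$ times to get

$$D^{q+1}\log f(x) = (-1)^q q!\sum_{|a_\mu|<\rho}\frac{1}{(x - a_\mu)^{q+1}} - (-1)^q q!\sum_{|b_\nu|<\rho}\frac{1}{(x - b_\nu)^{q+1}} + S_\rho(x) + I_\rho(x),$$

where

$$S_\rho(x) = q!\sum_{|a_\mu|<\rho}\left(\frac{\bar{a}_\mu}{\rho^2 - \bar{a}_\mu x}\right)^{q+1} - q!\sum_{|b_\nu|<\rho}\left(\frac{\bar{b}_\nu}{\rho^2 - \bar{b}_\nu x}\right)^{q+1},$$

$$I_\rho(x) = \frac{(q+1)!}{\pi}\int_0^{2\pi}\log|f(\rho e^{i\theta})|\,\frac{\rho e^{i\theta}\,d\theta}{(\rho e^{i\theta} - x)^{q+2}}.$$

He then showed that $S_\rho(x)$ and $I_\rho(x)$ uniformly converged to zero for $|x| < r$ as 
$\rho \to \infty$. Taking this for granted, we have 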


18 Nevanlinna (1974). 




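$$D^{q+1}\log f(x) = (-1)^q q!\lim_{\rho\to\infty}\left[\sum_{|a_\mu|<\rho}\left(\frac{1}{x - a_\mu}\right)^{q+1} - \sum_{|b_\nu|<\rho}\left(\frac{1}{x - b_\nu}\right)^{q+1}\right].$$

Because of the uniform convergence, he could integrate $q + 1$ times to get

$$\begin{aligned} \log f(x) = \sum_{\nu=0}^{q} c_\nu x^\nu + \lim_{\rho\to\infty}\Bigg[ & \sum_{|a_\mu|<\rho}\left(\log\left(1 - \frac{x}{a_\mu}\right) + \frac{x}{a_\mu} + \cdots + \frac{1}{q}\left(\frac{x}{a_\mu}\right)^q\right) \\ & - \sum_{|b_\nu|<\rho}\left(\log\left(1 - \frac{x}{b_\nu}\right) + \frac{x}{b_\nu} + \cdots + \frac{1}{q}\left(\frac{x}{b_\nu}\right)^q\right)\Bigg]. \end{aligned}$$

The result follows after exponentiation, since in case $f(0) = 0$, $f(0)$ can be 
replaced by $\lim_{x\to 0} x^{-\alpha}f(x)$ for a suitable positive integer $\alpha$. 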


36.8 Picard’s Theorem 


In his 1953 A Mathematician’s Miscellany, later published with additional material as 
Littlewood’s Miscellany, J. E. Littlewood raised and answered the question¹⁹ “whether 
a dissertation of 2 lines could deserve and get a Fellowship.” He answered in the 
affirmative, giving examples, including Picard’s theorem, for which there was a one-line 
statement and a one-line proof:


(Theorem.) An integral [entire] function $f(z)$ never 0 or 1 is a constant. 
(Proof.) $\exp\{i\Omega(f(z))\}$ is a bounded integral function. 


Littlewood explained that $\tau = \Omega(w)$ was the inverse of the modular function 
$w = k^2(\tau)$, arising in the theory of elliptic functions. The function $k^2(\tau)$ gave an 
analytic map from the half-plane $\{\tau \in \mathbb{C} : \operatorname{Im}\tau > 0\}$ onto $\mathbb{C}\setminus\{0, 1\}$. Although the 
inverse $\Omega$ was many-valued, for any branch of it, $\Omega(f(z))$ extended analytically to 
give an entire function from $\mathbb{C}$ into $\{\tau \in \mathbb{C} : \operatorname{Im}\tau > 0\}$. Note further that this argument 
implies that $\exp\{i\Omega(f(z))\}$ is a bounded analytic function; hence, by Liouville’s 
theorem, it is a constant. Therefore, $f$ is a constant. This was Picard’s proof, but 
recall that in 1879, the study of the inverse of the modular function $\Omega$ was not well 
established. So Picard used some care to prove that it was possible to define a single-valued 
branch of $\Omega(f(z))$ on the complex plane.²⁰ Littlewood continued by imagining 
what a referee’s report could have been: 


Exceedingly striking and a most original idea. But, brilliant as it undoubtedly is, it seems more 
odd than important; an isolated result, unrelated to anything else, and not likely to lead anywhere. 


It was clearly difficult to foresee the large number of interesting developments of 
complex function theory that would arise from Picard’s theorem. 


19 Littlewood (1986) p. 40. 
20 Picard (1879). 
36.9 Borel’s Theorem 


Recall the Hadamard–Borel factorization theorem: An entire function of finite order 
$f(z)$ can be written in the form $z^m P(z)e^{Q(z)}$, where $Q(z)$ is a polynomial and $P(z)$ 
is the canonical product constructed from the zeros of $f$. Note here that the order of 
the entire function $P(z)$ is equal to the exponent of convergence of the zeros of $f$. 
We may deduce from this theorem, since $e^{Q(z)}$ is of integral order, that if $f$ is of 
nonintegral order $\rho$, then the order of $P(z)$ must be $\rho$. This in turn implies that for an 
entire function $f(z)$ of nonintegral order $\rho$ and any complex number $x$, the exponent 
of convergence of the zeros of $f(z) - x$ is also $\rho$. In keeping with the notation of 
Weierstrass, Valiron, and Nevanlinna, we sometimes employ $x$ to represent a complex 
number or variable. In 1900, Borel showed²¹ that for entire functions of integral 
order $\rho$, the exponent of convergence of the zeros of $f(z) - x$ was equal to $\rho$ except 
for at most one value of $x$. These exceptions became known as the Borel exceptional 
values. 

Outlining Borel’s proof,²² first suppose $a$ and $b$ are two exceptional values of $x$. 
Then by Hadamard’s theorem

$$f(z) - a = z^{m_1}P_1(z)e^{Q_1(z)} \quad\text{and}\quad f(z) - b = z^{m_2}P_2(z)e^{Q_2(z)}, \tag{36.23}$$

where $Q_1$ and $Q_2$ are polynomials of degree $\rho$ and $P_1$ and $P_2$ are canonical products 
of order less than $\rho$. By subtracting the equations and multiplying by $e^{-Q_2}$, we have

$$z^{m_1}P_1 e^{Q_1 - Q_2} = z^{m_2}P_2 + (b - a)e^{-Q_2}.$$

The term on the right-hand side has order equal to $\rho$ and hence the polynomial 
$Q_1 - Q_2$ must be of degree $\rho$. Now differentiate the equation

$$z^{m_1}P_1 e^{Q_1} - z^{m_2}P_2 e^{Q_2} = b - a$$

to get

$$\left(z^{m_1}P_1 Q_1' + (z^{m_1}P_1)'\right)e^{Q_1} - \left(z^{m_2}P_2 Q_2' + (z^{m_2}P_2)'\right)e^{Q_2} = 0.$$

The coefficients of $e^{Q_1}$ and $e^{Q_2}$ are entire functions of order less than $\rho$, since the 
order of the derivative does not exceed the order of the function. So we can factorize 
these coefficients by the Hadamard–Borel theorem to obtain

$$z^{m_3}P_3 e^{Q_3}e^{Q_1} - z^{m_4}P_4 e^{Q_4}e^{Q_2} = 0,$$

where $Q_3$ and $Q_4$ are polynomials of degree at most $\rho - 1$, with $P_3$ and $P_4$ canonical 
products of orders less than $\rho$. Now rewrite the last equation as

$$e^{Q_1 - Q_2 + Q_3 - Q_4} = z^{m_4 - m_3}\,\frac{P_4}{P_3}.$$

21 See Valiron (1949). 
22 Borel (1900). 
The degree of the polynomial $Q_1 - Q_2 + Q_3 - Q_4$ is $\rho$, and hence the left-hand 
side is an entire function of order $\rho$. On the other hand, the order of the function on 
the right-hand side is less than $\rho$. This contradiction proves the theorem. 


36.10 Nevanlinna’s Second Fundamental Theorem 


In a paper of 1925, in what is now called his second fundamental theorem, R. 
Nevanlinna gave a far-reaching generalization of Picard’s theorem.²³ Nevanlinna’s 
result showed that the term $N(r, a)$ was the dominant part of the characteristic function 
and that most of the roots of the equation $f(z) = a$ were simple. In his influential 1929 
book on the Picard–Borel theorem, he discussed his theorem. He supposed $f(x)$ to be 
a meromorphic function and $z_1, z_2, \ldots, z_q$ ($q \ge 3$) distinct complex numbers, finite or 
not. Then

$$(q - 2)\,T(r, f) < \sum_{\nu=1}^{q} N(r, z_\nu) - N_1(r) + S(r), \tag{36.24}$$

where

$$N_1(r) = N\left(r, \frac{1}{f'}\right) + \left(2N(r, f) - N(r, f')\right)$$

and where the expression $S(r)$ satisfied: 


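1. For any positive number $\lambda$,

$$\int_{r_0}^{r}\frac{S(t)}{t^{\lambda+1}}\,dt = O\left(\int_{r_0}^{r}\frac{\log T(t, f)}{t^{\lambda+1}}\,dt\right). \tag{36.25}$$
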
2. Moreover,

$$S(r) < O\left(\log T(r, f) + \log r\right) \tag{36.26}$$

except for a set of finite linear measure. And if $f(x)$ was of finite order, that is,

$$\limsup_{r\to\infty}\frac{\log T(r, f)}{\log r} < \infty,$$

then

$$S(r) = O(\log r) \tag{36.27}$$

without restriction. 

The proof of this theorem is lengthy and requires the computation of several 
estimates, the most important of which shows that $m\left(r, \frac{f'}{f}\right)$ is in general negligible in 
comparison with $T(r, f)$. For this quantity Nevanlinna proved that 


23 Nevanlinna (1925). 
$$m\left(r, \frac{f'}{f}\right) = O\left(\log(r\,T(r, f))\right),$$

except on a set of finite linear measure when $f$ is of infinite order, and

$$m\left(r, \frac{f'}{f}\right) = O(\log r),$$

without restriction, when $f$ is of finite order. 

To derive Picard’s theorem, suppose $f$ is an entire function that does not assume 
the values $a$ and $b$. Take $q = 3$ and $z_1 = a$, $z_2 = b$, and $z_3 = \infty$ in (36.24) and 
since $N_1(r)$ is positive, we have $T(r, f) < S(r)$, contradicting (36.26). Thus, Picard’s 
theorem is proved. 

Nevanlinna combined the two fundamental theorems to derive an elegant extension 
of Picard’s theorem. By the first fundamental theorem

$$\lim_{r\to\infty}\frac{m(r, a) + N(r, a)}{T(r, f)} = 1.$$

He set

$$\delta(a) = \liminf_{r\to\infty}\frac{m(r, a)}{T(r, f)} = 1 - \limsup_{r\to\infty}\frac{N(r, a)}{T(r, f)},$$

and by the second fundamental theorem

$$\sum_{\nu=1}^{q}\delta(z_\nu) \le 2.$$

Observe that from this, Borel’s theorem can be deduced. If $a$ is a Borel exceptional 
value of an entire function, the reader may easily verify that $\delta(a) = 1$ and that 
$\delta(\infty) = 1$, the latter because an entire function has no poles. By the preceding inequality, we know that there cannot be more than 
one exceptional value, completing the derivation. 


36.11 Exercises 


(1) Suppose that all the roots of

$$f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$$

are real and that $\phi(x)$ is an entire function of genus 0 or 1. Suppose also that 
$\phi(x)$ is real for real $x$ and all its zeros are real and negative. Prove that all the 
roots of

$$g(x) = a_0\phi(0) + a_1\phi(1)x + \cdots + a_n\phi(n)x^n$$

are real; in fact, that $f(x)$ and $g(x)$ have the same number of positive zeros and 
the same number of negative zeros. See Laguerre (1972) vol. 1, p. 201. 
(2) Show that if $f(z) = \sum a_nz^n$ is an entire function of finite order $\rho$, then

$$\rho = \limsup_{n\to\infty}\frac{n\log n}{\log(1/|a_n|)}.$$

See Lindelöf (1902). 

(3) Let $f(z) = \sum a_nz^n$ be an entire function of finite order and let

$$m(r) = \max(|a_n|r^n), \qquad n = 0, 1, 2, \ldots.$$

Prove that

$$\lim_{r\to\infty}\frac{\log M(r, f)}{\log m(r)} = 1.$$

See Valiron (1949), p. 32. 


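(4) Let $f(z)$ be of finite order $\rho$ and of finite type $\tau = \limsup_{r\to\infty} r^{-\rho}\log M(r, f)$. Show that if

$$L = \limsup_{r\to\infty} r^{-\rho}\,n(r, f) \quad\text{and}\quad l = \liminf_{r\to\infty} r^{-\rho}\,n(r, f),$$

then $l + L \le e\rho\tau$. See Shah (1948) and Boas (1954) p. 16. Swarupchand 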
Mohanlal Shah (1905-1996) received his appointment at Aligarh Moslem 
University (India) from André Weil, who served there as department head 
from 1931 to 1933. In 1942, Shah received his Ph.D. from the University of 
London under Hardy’s student Titchmarsh. Returning to Aligarh, Shah served 
as head of the department from 1953 to 1958 when he reached the mandatory 
retirement age in India; he then took up a second mathematics career in the 
United States. He taught for more than twenty years in the United States, 
at Kansas and at Kentucky. Shah published hundreds of papers in complex 
analysis and gave a boost to a number of young mathematicians by encouraging 
them and collaborating with them. 


(5) Show that if $a_1 \ne 0$ and $f(x) = a_0 + a_1x + \cdots$ is analytic at the origin, then there 
is a number $L$, depending only on $a_0$ and $a_1$, such that if $f(x)$ is analytic in the 
disk $|x| < L$, then $f(x)$ must take the value 0 and/or 1 somewhere in the disk. 
See Landau (1904). In 1905, Constantin Carathéodory found an expression for 
$L$ in terms of the fundamental branch of the inverse of the elliptic modular 
function. Carathéodory used what is now called Schwarz’s lemma, a result he 
extracted from Schwarz’s work; he showed its importance, thereby elevating it 
to the status of an important lemma. Georg Pick then generalized this lemma. 


(6) Suppose $f(z)$ is meromorphic and has only a finite number of poles, and that 
$f(z)$, $f^{(l)}(z)$ have only a finite number of zeros for some $l \ge 2$. Show that then

$$f(z) = \frac{P_1(z)e^{P_2(z)}}{P_3(z)},$$

with $P_1, P_2, P_3$ polynomials. Furthermore, if $f(z)$ and $f^{(l)}(z)$ have no zeros, 
then either $f(z) = e^{Az+B}$ or else $f(z) = (Az + B)^{-n}$. This result is due to 
J. Clunie. See Hayman (1964) p. 67. 

(7) Let $f(z)$ be a meromorphic function of order $\rho$, where $0 \le \rho < \frac{1}{2}$, and suppose that $\delta(a, f) > 0$ 
when $\rho = 0$ and $\delta(a, f) \ge 1 - \cos\pi\rho$ when $\rho > 0$. Show that then $a$ is 
the only deficient value of $f(z)$; in particular, a meromorphic function of order 
zero can have at most one deficient value. This result is due to the German 
mathematician Oswald Teichmüller (1913-1943) for functions with positive 
poles and negative zeros; to the Russian mathematician A. A. Goldberg for the 
general case. See Hayman (1964) p. 114. 


36.12 Notes on the Literature 


For references to works on entire functions, the reader may consult Borel (1900), 
Valiron (1949), and Boas (1954). For Nevanlinna theory, see Nevanlinna (1974), 
a reprint of his 1929 book, and Hayman (1964). Neuenschwander (1978) gives 
a history of the Casorati–Weierstrass theorem; Picard’s theorem is a far-reaching
refinement of this theorem. Littlewood’s witty comments on Picard’s theorem first
appeared in A Mathematician’s Miscellany. Littlewood (1986) was put out by his
friend Béla Bollobás, who also wrote a twenty-two page foreword; it contains a reprint
of Littlewood’s 1953 book along with photographs and some additional material by 
Littlewood. 

See Cherry and Ye (2001), M. Ru (2001), and Bombieri and Gubler (2006) for 
treatments of the remarkable analogy between the Diophantine equations and value 
distribution or Nevanlinna theory. This parallel has been worked out in some detail 
and has led to significant advances in both areas. 
Univalent Functions 


37.1 Preliminary Remarks 


Weierstrass constructed a theory of functions using power series as the basic object, 
in contrast with Riemann, who studied analytic functions as mappings, specifically 
conformal mappings. The Bieberbach conjecture was rooted in these dual aspects 
of analytic function theory; it simultaneously viewed a function as a mapping and 
as a series. Thus, Bieberbach considered conformal mappings, such as those studied 
by Riemann, and then speculated on the magnitude of the coefficients, assuming the 
first two to be zero and one, respectively. A function f analytic in a domain D, an 
open and connected subset of the complex plane, is called univalent in D if it does 
not assume any value more than once. A univalent function $f$ maps $D$ conformally
onto its image domain $f(D)$. Riemann was the first to study conformal mappings in
the context of complex function theory. In his 1851 doctoral dissertation, he stated
his famous theorem,¹ now called the Riemann mapping theorem, that any simply
connected proper subdomain $D$ of the complex plane could be conformally mapped
onto the unit disk $|z| < 1$. Note here that the mapping must be one-to-one and analytic.
This mapping $f$ is unique if we require that for a given point $z_0$ in the domain $D$,
$f(z_0) = 0$ and $f'(z_0) > 0$. Observe that since the inverse of a univalent function
is also univalent, it is of interest to consider functions univalent on the unit disk. We
denote by $S$ the set of normalized univalent functions on the unit disk, that is, univalent
functions for which $f(0) = 0$ and $f'(0) = 1$. The Taylor expansion of $f$ would take
the form


$$f(z) = z + a_2 z^2 + a_3 z^3 + \cdots + a_n z^n + \cdots. \tag{37.1}$$


In a paper of 1916,² Ludwig Bieberbach (1886-1982) proved that $|a_2| \le 2$ and
then, in a footnote, conjectured that $|a_n| \le n$. Attempts to prove this conjecture led
to valuable developments in the theory of analytic functions of one variable, lending 
it additional significance. Louis de Branges’s 1984 proof of this conjecture concluded 


1 Riemann (1990) pp. 35-75 or Riemann (2004) pp. 1-39. 
2 Bieberbach (1916). 
an era in the theory of functions, comparable, albeit on a smaller scale, to the 350- 
year era in number theory brought to an end by Andrew Wiles’s 1994 resolution of 
Fermat’s problem. 

Riemann gave a sketch of a proof of his mapping theorem, but in 1871 his student, 
F. Emil Prym, found a flaw in his line of reasoning, even apart from Riemann’s use of 
an unproved variational principle rigorously established by Hilbert only a half century 
later. Note that George Green had made use of this principle in his famous work of 
1828 on electricity and magnetism.³ In spite of its shaky foundations, however, the
significance of Riemann’s mapping theorem was immediately recognized. In 1867,
Dirichlet’s student Elwin B. Christoffel (1829-1900) showed that the upper half
plane could be conformally mapped onto polygonal regions by means of functions
defined by integrals.⁴ Note that the upper half plane may be mapped onto the unit
disk by a fractional linear transformation. About two years later, Christoffel’s result
was independently rediscovered by H. A. Schwarz. In the 1870s, Carl Neumann and
Schwarz used potential theoretic methods to prove the mapping theorem for regions
bounded by analytic arcs. In the years around 1900, Hilbert brought renewed attention
to the Riemann mapping problem and its generalization, the uniformization theorem,
with the statement of his twenty-second problem⁵ and with his proofs of the Dirichlet
principle. 

In 1907, Paul Koebe (1882-1945) and Henri Poincaré proved the uniformization 
theorem that every simply connected Riemann surface was conformal to one of the 
three: the unit disk, the complex plane, or the extended complex plane.⁶ Poincaré’s
work was a continuation of methods and ideas he had developed in the early 1880s, 
when he established the theory of Fuchsian and Kleinian groups and the related 
theory of automorphic functions. Felix Klein played an equally important role in this 
development. In fact, Klein and Poincaré corresponded regularly in 1881-82 while 
creating these theories by differing approaches and techniques. In one of his proofs 
of the uniformization theorem, Koebe showed that the set S of normalized univalent 
functions was a normal family. Now a family F of analytic functions defined on a 
domain D is called normal if every sequence of functions f_n in F has a subsequence
converging uniformly on each compact subset of D. The concept of a normal family 
is due to Paul Montel (1876-1975), a student of Borel and Lebesgue. In a June 1935 
letter to Zermelo, Carathéodory discussed the history of this concept:⁷


The word and the notion “normal family” comes from Montel, who had shaped it around 1904. 
This notion has emerged from a further development of the Weierstrass double-series theorem 
stemming from Stieltjes (around 1895). If one notes that for all analytic functions f(z), which 
are regular for |z| < 1 and satisfy the condition | f(z)| < 1 there, all coefficients of the power 
series $a_0 + a_1 z + \cdots = f(z)$ are uniformly limited, it follows that from every set $\{f(z)\}$ of such
functions one can choose a uniformly convergent sequence on every circle |z| < r < 1. This led 
Montel to give the name “normal families” to all sets of functions which possess an analogous 


3 Green (1970) pp. 23-41. 

4 Christoffel (1867) p. 97. 

5 Yandell (2002) p. 417. 

6 Koebe (1907-1908) and Poincaré (1907). 
7 Georgiadou (2004) p. 82. 
property. So, one was able to show that all functions which are regular in a domain G and are
≠ 0, 1 constitute a normal family; the Picard theorem follows on from here easily. The notion
of the limiting oscillation which allows us to speak of families that are normal in a point comes 
from me. 


Constantin Carathéodory (1873-1950) was a German mathematician of Greek 
descent. He initially studied engineering at the Military School of Belgium and 
was involved with the construction of the Assiut dam in Egypt. Abandoning his 
engineering career due to an increasing attraction to mathematics, Carathéodory 
attended H. A. Schwarz’s Berlin colloquia; he received his doctoral degree from 
Göttingen in 1904 under Minkowski for a thesis on the calculus of variations; his
peripatetic career was then spent at a number of institutions in Germany, Greece,
and the United States. Hans Rademacher was his 1916 student at Göttingen.
Carathéodory’s interest in function theory was aroused when Pierre Boutroux, nephew
of H. Poincaré, visited Göttingen in 1905. Boutroux was then trying to simplify
E. Borel’s recent proof of Picard’s theorem on entire functions and discussed this
problem with Carathéodory. In his autobiographical notes, Carathéodory recalled this
encounter:⁸


Boutroux had noticed that his proof was successful only because in the case of conformal 
mappings there was a remarkable rigidity, which, by the way, he was not able to put into formulae. 
Boutroux’s discovery did not let me rest and six weeks later I was able to prove Landau’s 
sharpening of the Picard theorem in a few lines by using the theorem which is today called the 
lemma of Schwarz. I produced this theorem with the help of Poisson’s integral; only through 
Erhard Schmidt, whom I had informed of my findings, did I learn not only that the theorem 
already exists in the work of Schwarz, but also that it can be gained by absolutely elementary 
means. Indeed, the proof, which Schmidt informed me about, cannot be improved. Thus, I gained 
a further field of activity apart from the calculus of variations. 


Schmidt’s proof of Schwarz’s lemma, a form of which was used by Schwarz in 1869 
for his proof of the Riemann mapping theorem, is the one usually found in complex 
analysis textbooks. It was Carathéodory, however, who revealed the importance of 
the lemma by giving several significant applications of it. It is due to his efforts that 
Schwarz’s lemma and its generalizations became so useful in complex function theory. 

In his important 1912 paper, Carathéodory applied Schwarz’s lemma to prove 
a result on kernel convergence, a key concept within geometric function theory.⁹
Suppose $G_1, G_2, \ldots, G_n, \ldots$ is an infinite sequence of simply connected domains in
the complex plane, containing the origin but not coinciding with the whole complex
plane. Suppose also that $f_n(z)$ is a conformal mapping of the unit disk onto the
domain $G_n$, with $f_n(0) = 0$ and $f_n'(0) > 0$. Carathéodory’s theorem related the
geometric behavior of the domains $G_n$ with the analytic behavior of the functions
$f_n$; this result was later employed by Löwner (Loewner) to develop his parametric
method for the study of univalent functions. Carathéodory applied it to determine 
the boundary behavior of conformal mappings. As he wrote in his letter to Hilbert 
in connection with this theorem, “A first application of this theorem is, for instance, 


8 ibid. p. 63. 
9 Carathéodory (1912). 
the proof of continuity of the conformal mapping as a function of its boundary, even
if the boundary is a non-analytic curve and the Cauchy theorem cannot be applied.”
To state Carathéodory’s convergence theorem, first suppose the origin is an interior
point of $\bigcap G_n$; then the kernel of the sequence $\{G_n\}$ is defined as the largest domain
$G$ containing the origin such that each compact subset of $G$ is contained in every
$G_n$, with the possible exception of a finite number of $G_n$. Note that it is easy to prove
that $G$ exists. Next, if the origin is not an interior point of $\bigcap G_n$, then the kernel is
defined by $G = \{0\}$. The sequence $\{G_n\}$ is said to converge to the kernel $G$ if every
subsequence of $\{G_n\}$ has $G$ as kernel. When convergence occurs, either $G = \{0\}$ or
$G$ is simply connected. Also, let $\{f_n\}$ be a sequence of univalent functions on the
unit disk with $f_n(0) = 0$ and $f_n'(0) > 0$; moreover, let $f_n$ map the unit disk to $G_n$. On
this basis, the theorem states that a sequence of functions $\{f_n\}$ converges uniformly
on compact subsets of the unit disk to a function $f$ if and only if $\{G_n\}$ converges
to the kernel $G \neq \mathbb{C}$. If convergence occurs, then either $G = \{0\}$, in which case
$f = 0$, or $G \neq \{0\}$, in which case $f$ is a conformal mapping from the unit disk to $G$.
To prove this theorem, Carathéodory used Schwarz’s lemma combined with Koebe’s
one-quarter theorem. The latter theorem was not fully proved until Bieberbach did so
in 1916; for Carathéodory’s convergence theorem, Koebe’s weaker result, for some
positive constant not necessarily $\frac14$, was sufficient.

Carathéodory’s results were used a decade later by Löwner to construct his parametric
theory of univalent functions. The Czech mathematician Karel Löwner (1893-1968)
was a student of Georg Pick, and his name was later spelled Karl Löwner and
then, after emigration to America, Charles Loewner. He studied in the German section
of the University of Prague, writing his thesis in 1917 on convex conformal mappings
under the direction of Pick, who himself did notable work in complex analysis. Pick’s
invariant form of the Schwarz lemma appears in several books on geometric function
theory. Note that Pick was a student of Weierstrass’s student Königsberger. Löwner’s
thesis contained interesting results on the growth of convex univalent functions
and their derivatives. He also proved that the Bieberbach conjecture would hold for
the subclass of convex univalent functions, and in fact, $|a_n| \le 1$. In an important
paper of 1923,¹⁰ Löwner developed a powerful method for dealing with the class
of univalent functions. Bieberbach was very impressed by this method and inserted
“I” (i.e., part I) in the title of Löwner’s paper, implying that Löwner should work
further in this area; unfortunately, Löwner did not return to the coefficient problem.
In his paper, he defined a subset $S_1$ of $S$ consisting of single slit mappings, univalent
functions mapping the unit disk onto the complex plane minus one analytic Jordan arc
extending to infinity. Using Carathéodory’s theorem, he showed that $S_1$ was a dense
subset of $S$ in the topology defined by uniform convergence on compact subsets. Next,
he proved that any function in $S_1$ could be obtained from the identity mapping by a
series of successive infinitesimal transformations. He gave a fairly simple differential
equation to effect this transformation; in fact, he gave two forms of this equation, one
of which he himself used to prove that $|a_i| \le i$ for $i = 2, 3$; the other form was used
by de Branges to derive the complete result. 


10 Loewner (1988) pp. 45-64. 
An alternative approach to the coefficient problem for univalent functions, using
area inequalities, was initiated in a 1914 paper¹¹ by the Swedish-American mathematician
Thomas Hakon Gronwall (1877-1932); Bieberbach’s work was an independent
discovery of the same idea. This method would play a part in the development 
of univalent functions and in the proof of the Bieberbach conjecture. Gronwall 
received his doctor’s degree in 1898 under Mittag-Leffler, but also learned from 
mathematicians such as H. von Koch, I. Fredholm, and E. Phragmén. He received an 
engineering degree from Berlin in 1902 and then worked at various steel works in the 
United States. In 1912 he returned to his first love and in the next two years published 
almost two dozen papers ranging over the topics of Fourier series, analytic functions, 
conformal mappings, and special functions. Consequently, he was invited to Princeton 
as an instructor in 1913, and was promptly promoted. Gronwall soon left Princeton to 
take up a number of other pursuits, but not before J. W. Alexander (1888-1971), the 
famous topologist, had completed his thesis on univalent functions under him. In fact, 
Alexander had been a protégé of O. Veblen (1880-1960) and had already published
a couple of papers in topology when Veblen suggested that he do his thesis in analysis
under Gronwall. Apparently, Veblen feared that topology might be a passing fad! 

In 1916, Bieberbach rediscovered one of Gronwall’s area inequalities and employed 
it to prove his theorem on the second coefficient in the Taylor expansion of a 
normalized univalent function.¹² In this paper, he also obtained a result on the growth
of a univalent function; he later used this to prove that $|a_n| = O(n^2)$. Then in
1923, Littlewood improved on this, showing that the order of the nth coefficient
had to be n. Seven years later, Littlewood made another significant contribution to
this topic in collaboration with his pupil Paley. R. E. A. C. Paley (1907-1933)
graduated from Trinity College in 1929. He wrote his dissertation under Littlewood on
nondifferentiable functions and was elected to a Trinity Fellowship in 1930. He was 
quickly blossoming into one of the leading British mathematicians of his generation 
when his life was cut short by a skiing accident in the Rocky Mountains. In his 
very brief career, he published almost thirty papers in several aspects of analysis 
and collaborated with such outstanding mathematicians as Littlewood, N. Wiener, and 
A. Zygmund. Littlewood and Paley proved that the coefficients of any odd univalent 
function in $S$ are bounded by a constant independent of the function.¹³ More precisely,
for all $F \in S$ of the form


$$F(z) = z + c_3 z^3 + c_5 z^5 + \cdots, \tag{37.2}$$


there exists an absolute constant $A$ independent of $F$ such that $|c_{2n+1}| < A$. In a
footnote they observed, “No doubt the true bound is given by A = 1.” This conjecture
makes sense in light of an earlier result of I. I. Privalov, that $A = 1$ for odd starlike
functions. A set $E \subset \mathbb{C}$ is called starlike with respect to a point $w_0 \in E$ if the line
segment joining $w_0$ to every point $w \in E$ lies entirely in $E$. A starlike function is a
conformal mapping of the unit disk onto a domain starlike with respect to the origin. 


11 Gronwall (1914). 
12 Bieberbach (1916). 
13 Littlewood and Paley (1932). 
This conjecture implied the Bieberbach conjecture, but it was proved false in 1933 by
the Hungarian mathematicians M. Fekete and G. Szegő.¹⁴ They used Löwner’s theory
to establish that


$$|c_5| \le \frac{1}{2} + e^{-2/3} = 1.013\ldots, \tag{37.3}$$


and that the inequality was sharp. A modification of the Paley-Littlewood conjecture 
was suggested by a result of the French mathematician Jean Dieudonné (1906-1992). 
Dieudonné, a founding member of the Bourbaki group, proved in 1931¹⁵ that if an
odd univalent function is real on the real axis, then


$$|c_{2n-1}| + |c_{2n+1}| \le 2, \quad \text{and} \quad |c_3| \le 1. \tag{37.4}$$

Then in 1936, M. S. Robertson applied the method of Fekete to prove that

$$|c_3| + |c_5| \le 2, \tag{37.5}$$


even when $F(z)$ was not real on the real axis.¹⁶ Combining this with Dieudonné’s
result, Robertson conjectured that the Littlewood-Paley conjecture was true on the
average for an odd univalent function:


$$\sum_{k=1}^{n} |c_{2k-1}|^2 \le n. \tag{37.6}$$


Observe that this implies the Bieberbach conjecture: If $f \in S$ is given by (37.1), and
the odd function $F(z) = (f(z^2))^{1/2}$ by (37.2), then the relation between the coefficients
of these two functions is given by


$$a_n = c_1 c_{2n-1} + c_3 c_{2n-3} + \cdots + c_{2n-1} c_1, \quad n \ge 1. \tag{37.7}$$

With $c_1 = 1$, the Cauchy–Schwarz inequality applied to (37.7) gives
$|a_n| \le \sum_{k=1}^{n} |c_{2k-1}|^2$, so that (37.6) yields $|a_n| \le n$.
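The coefficient relation (37.7) can be checked on a concrete example. The following Python sketch is an illustration only and is not taken from the text: it uses exact rational arithmetic to compute the coefficients of the odd function $F(z) = (f(z^2))^{1/2}$ for the Koebe function $f(z) = z/(1-z)^2$, whose coefficients are $a_n = n$. The helper names `sqrt_series` and `a_from_c` are hypothetical.

```python
from fractions import Fraction

def sqrt_series(p, terms):
    """Coefficients q[0..terms-1] of the power series q with q*q == p.

    Assumes p[0] == 1 and selects the branch with q[0] == 1.
    """
    q = [Fraction(1)] + [Fraction(0)] * (terms - 1)
    for m in range(1, terms):
        # The coefficient of w^m in q*q is 2*q[m] + sum_{j=1}^{m-1} q[j]*q[m-j].
        cross = sum((q[j] * q[m - j] for j in range(1, m)), Fraction(0))
        q[m] = (p[m] - cross) / 2
    return q

N = 8
# Koebe function: f(w)/w = 1 + 2w + 3w^2 + ..., that is, a_n = n.
a = [Fraction(n) for n in range(1, N + 1)]      # a[n-1] holds a_n

# F(z) = z * (f(w)/w)^{1/2} with w = z^2, so c_odd[k] holds c_{2k+1}.
c_odd = sqrt_series(a, N)

def a_from_c(n):
    """Right-hand side of (37.7): sum over k of c_{2k-1} * c_{2(n-k)+1}."""
    return sum((c_odd[k] * c_odd[n - 1 - k] for k in range(n)), Fraction(0))

if __name__ == "__main__":
    print([int(c) for c in c_odd])                      # odd coefficients of F
    print([int(a_from_c(n)) for n in range(1, N + 1)])  # recovered a_n
```

For the Koebe function every $c_{2k-1}$ equals 1, so (37.6) holds with equality and (37.7) returns $a_n = n$.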


De Branges’s proof of the Bieberbach conjecture in actuality demonstrated a more 
general result, Milin’s conjecture; this concerned the logarithmic coefficients of the 
univalent function. Thus, de Branges’s method did not directly yield Bieberbach’s
conjecture. We note that logarithmic coefficients in connection with univalent functions
were first considered by Helmut Grunsky (1904-1986). Grunsky was an excellent
analyst with a long and varied career; Ahlfors remarked that his thesis on extremal
problems in conformal mappings was “a truly remarkable piece of work.”¹⁷ In 1939,
while at Berlin, Grunsky showed¹⁸ that an analytic function


$$g(z) = z + b_0 + \frac{b_1}{z} + \frac{b_2}{z^2} + \cdots$$
14 Fekete and Szegő (1933). 

15 Dieudonné (1931). 
in a neighborhood of $\infty$ would extend to an injective and analytic function in the disk
$|z| > 1$ (i.e., $g \in \Sigma$), if and only if its Grunsky coefficients, defined by


$$\log \frac{g(z) - g(\zeta)}{z - \zeta} = -\sum_{k=1}^{\infty} \sum_{l=1}^{\infty} b_{kl}\, z^{-k} \zeta^{-l}, \tag{37.8}$$


satisfied the Grunsky inequalities


$$\left| \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} b_{kl}\, x_k x_l \right| \le \sum_{k=1}^{\infty} \frac{|x_k|^2}{k}, \tag{37.9}$$


where $\{x_k\}$ was a sequence of complex numbers. Clearly, the Grunsky coefficients
provided a characterization of the property of univalence. Grunsky’s proof of the
theorem employed contour integration and was not difficult, although the expressions
of the Grunsky coefficients $b_{kl}$ in terms of the coefficients $b_k$ of $g$ were very
complicated. Perhaps this is one reason that the effectiveness of Grunsky’s inequality
was not noticed until around 1960 when it was used by Z. Charzynski and M. Schiffer
to reprove the result¹⁹ that $|a_4| \le 4$. In 1955, Schiffer and P. R. Garabedian had already
proved²⁰ $|a_4| \le 4$ by means of a powerful variational technique developed by Schiffer
in the 1930s. Soon after the work of Charzynski and Schiffer, a generalization of
Gronwall’s area theorem in terms of the Grunsky coefficients was noted by a number
of mathematicians, including J. A. Jenkins, Milin, and C. Pommerenke. Schiffer had
already made this observation in 1948. For Pommerenke’s formulation, let $g \in \Sigma$, and
let $x_1, x_2, \ldots, x_m$ be complex numbers not all zero. Then


$$\sum_{k=1}^{\infty} k \left| \sum_{l=1}^{m} b_{kl}\, x_l \right|^2 \le \sum_{l=1}^{m} \frac{|x_l|^2}{l}, \tag{37.10}$$


where equality holds if and only if the area of $\mathbb{C} \setminus g(|z| > 1)$, that is, the complement of
the image of $|z| > 1$, is zero. Note that when $m = \infty$, (37.10) and (37.9) are equivalent.

In a 1964 paper,²¹ I. M. Milin applied the area method to study the properties of
{An(f)}, defined by 


2-6 
——_——_— = An ; 37.11 
CFD FO a5 (Gham (37.11) 


n=1 

where $F \in \Sigma$. Soon after this, I. E. Bazilevich worked directly with $\log \frac{f(z)}{z}$ and
proved an interesting inequality about its coefficients. In his account of the motivations
behind his conjecture, Milin wrote,²² “In this way I developed the conviction that
the property of univalence reveals itself rather simply through area theorems or 
other methods in the form of restrictions on the coefficients of the logarithmic 


19 Charzynski and Schiffer (1960). 
20 Garabedian and Schiffer (1955). 
21 Milin (1964). 
22 Milin (1986). 
function (37.11) and $\log \frac{f(z)}{z} = 2 \sum_{n=1}^{\infty} \gamma_n z^n$, and that it is necessary to construct an
‘apparatus of exponentiation’ to transfer the restrictions from logarithmic coefficients 
to coefficients of the functions themselves.” 

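It was with this in mind that in 1965, N. A. Lebedev and Milin worked out the
exponential inequality:²³ If $\sum_{k=1}^{\infty} A_k z^k$ is an arbitrary power series with positive
radius of convergence and


$$\exp\left( \sum_{k=1}^{\infty} A_k z^k \right) = \sum_{k=0}^{\infty} D_k z^k,$$


then


$$\sum_{k=0}^{n-1} |D_k|^2 \le n \exp\left\{ \frac{1}{n} \sum_{\nu=1}^{n-1} \sum_{k=1}^{\nu} \left( k|A_k|^2 - \frac{1}{k} \right) \right\}. \tag{37.12}$$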
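A numerical spot check can make the inequality (37.12) more concrete. The Python sketch below is an illustration under stated assumptions, not the authors' method: it obtains the $D_k$ from the standard recurrence $k D_k = \sum_{j=1}^{k} j A_j D_{k-j}$ (which follows from differentiating the exponential) and then evaluates both sides. The function names `exp_series` and `lebedev_milin_sides` are hypothetical.

```python
import math
import random

def exp_series(A, terms):
    """D[0..terms-1] with exp(sum_{k>=1} A[k] z^k) = sum_k D[k] z^k.

    A[0] is ignored; A must provide indices 1..terms-1.
    """
    D = [1 + 0j] + [0j] * (terms - 1)
    for k in range(1, terms):
        # Differentiating the exponential gives k*D_k = sum_{j=1}^{k} j*A_j*D_{k-j}.
        D[k] = sum(j * A[j] * D[k - j] for j in range(1, k + 1)) / k
    return D

def lebedev_milin_sides(A, n):
    """Return (left side, right side) of (37.12) for the given n >= 1."""
    D = exp_series(A, n)
    lhs = sum(abs(D[k]) ** 2 for k in range(n))
    inner = sum(
        sum(k * abs(A[k]) ** 2 - 1 / k for k in range(1, v + 1))
        for v in range(1, n)
    )
    rhs = n * math.exp(inner / n)
    return lhs, rhs

if __name__ == "__main__":
    rng = random.Random(0)
    A = [0j] + [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(7)]
    for n in range(1, 8):
        print(n, lebedev_milin_sides(A, n))
```

With $A_k = 1/k$ the exponential is $1/(1-z)$, every $D_k$ equals 1, and both sides equal $n$, so the inequality cannot be improved in general.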


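Now note that if we write $(f(z^2))^{1/2} = \sum_{n=0}^{\infty} c_{2n+1} z^{2n+1}$, then


$$\left( \frac{f(z)}{z} \right)^{1/2} = \exp\left( \frac{1}{2} \log \frac{f(z)}{z} \right)$$


implies that for $\gamma_n$, as defined in Milin’s quotation,


$$\sum_{n=0}^{\infty} c_{2n+1} z^n = \exp\left\{ \sum_{n=1}^{\infty} \gamma_n z^n \right\}. \tag{37.13}$$


Applying the Lebedev–Milin inequality, we obtain


$$\sum_{k=0}^{n-1} |c_{2k+1}|^2 \le n \exp\left\{ \frac{1}{n} \sum_{\nu=1}^{n-1} \sum_{k=1}^{\nu} \left( k|\gamma_k|^2 - \frac{1}{k} \right) \right\}. \tag{37.14}$$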


Milin observed this inequality in 1970 in the course of writing his book on univalent
functions.²⁴ He perceived that if the inequalities


$$\sum_{\nu=1}^{n} \sum_{k=1}^{\nu} \left( k|\gamma_k|^2 - \frac{1}{k} \right) \le 0, \quad n = 1, 2, 3, \ldots, \tag{37.15}$$


were true, then Robertson’s conjecture (and hence Bieberbach’s conjecture) followed
from (37.14). For Koebe’s function $f(z)$, given by (37.25), $|\gamma_k| = \frac{1}{k}$ so that equality
holds in (37.15). Milin did not state (37.15) as a conjecture in his book, although he


23 Lebedev and Milin (1965). 
24 See Milin (1986) p. 111. 
had evidence to support it. For instance, inspired by a result of Pommerenke, Milin
obtained the inequality


$$\sum_{k=1}^{n} k|\gamma_k|^2 \le \sum_{k=1}^{n} \frac{1}{k} + \delta, \quad \delta < 0.312. \tag{37.16}$$


It was in a 1972 paper²⁵ that A. Z. Grinshpan, with the approval of Lebedev and
Milin, referred to inequality (37.15) as Milin’s conjecture.
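The statement that Koebe’s function gives equality in (37.15) can be verified directly. The following Python sketch is illustrative and not from the source: it computes logarithmic coefficients $\gamma_k$ from the Taylor coefficients of $f(z)/z$ through the recurrence obtained by differentiating the logarithm, and then evaluates the double sum in (37.15). It assumes real rational coefficients so that exact arithmetic can be used; the names `log_coeffs`, `gamma_coeffs`, and `milin_sum` are hypothetical.

```python
from fractions import Fraction

def log_coeffs(p, terms):
    """L[0..terms-1] with log(sum_k p[k] z^k) = sum_k L[k] z^k, assuming p[0] == 1."""
    L = [Fraction(0)] * terms
    for k in range(1, terms):
        # From p' = L' * p:  k*p_k = k*L_k + sum_{j=1}^{k-1} j*L_j*p_{k-j}.
        tail = sum((j * L[j] * p[k - j] for j in range(1, k)), Fraction(0))
        L[k] = p[k] - tail / k
    return L

def gamma_coeffs(p, terms):
    """Logarithmic coefficients: log(f(z)/z) = 2 * sum_k gamma_k z^k."""
    return [c / 2 for c in log_coeffs(p, terms)]

def milin_sum(gamma, n):
    """Double sum in (37.15), written for real coefficients gamma[1..n]."""
    return sum(
        (k * gamma[k] ** 2 - Fraction(1, k)
         for v in range(1, n + 1) for k in range(1, v + 1)),
        Fraction(0),
    )

if __name__ == "__main__":
    koebe = [Fraction(k + 1) for k in range(8)]   # f(z)/z = 1 + 2z + 3z^2 + ...
    gam = gamma_coeffs(koebe, 8)
    print(gam[1:])                                # 1, 1/2, 1/3, ...
    print([milin_sum(gam, n) for n in range(1, 8)])
```

As a second example, the half-plane mapping $f(z) = z/(1-z)$ has $\gamma_k = 1/(2k)$, and the same sum is strictly negative for every $n$.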

Lebedev, Milin, E.G. Emelyanov, and Grinshpan were members of the Leningrad 
or St. Petersburg school of geometric function theory. In 1984, these mathematicians 
and other members of the Leningrad Seminar joined together and exerted considerable 
effort to reformulate in classical form de Branges’s proof of the Milin conjecture, 
making it more accessible to the community of geometric function theorists. The 
Leningrad school was founded by G. M. Goluzin (1906-1952). Goluzin entered 
Leningrad University in 1924 and remained there in various capacities until his death. 
He was appointed professor of mathematics in 1938, and from then on he led the 
seminar and built up a school of function theorists. Goluzin made major contributions 
to the theory of univalent functions and developed a variation on Schiffer’s technique 
of interior variations, applying it to several problems and deriving a number of 
deep results. In an early paper, he applied Löwner’s parametric method to obtain a
sharp bound on $|\arg f'(z)|$ for $f \in S$. Another easily stated theorem of Goluzin is
that $|a_n| < \frac{3}{4} e n$, an improvement on Littlewood; of course, he derived the Goluzin
inequality: If $g \in \Sigma$, $z_\nu$ lie in the set $|z| > 1$ and $\gamma_\nu \in \mathbb{C}$, $\nu = 1, 2, \ldots, n$, then


$$\left| \sum_{\mu=1}^{n} \sum_{\nu=1}^{n} \gamma_\mu \gamma_\nu \log \frac{g(z_\mu) - g(z_\nu)}{z_\mu - z_\nu} \right| \le \sum_{\mu=1}^{n} \sum_{\nu=1}^{n} \gamma_\mu \overline{\gamma_\nu} \log \frac{1}{1 - (z_\mu \overline{z_\nu})^{-1}}. \tag{37.17}$$


We observe that this inequality can be derived from Grunsky’s; conversely, this
implies Grunsky. In 1972, the American mathematician C. H. FitzGerald exponentiated
Goluzin’s inequality to obtain what is now called FitzGerald’s inequality, from
which he derived several coefficient inequalities.²⁶ For example, he showed that
$|a_n| < \sqrt{7/6}\, n < 1.081\, n$, and in 1978, D. Horowitz, using the same method, made an
improvement,²⁷ obtaining


$$|a_n| < \left( \frac{1{,}659{,}164{,}137}{681{,}080{,}400} \right)^{1/14} n \approx 1.0657\, n.$$


Until de Branges, this was the best result on the Bieberbach conjecture for all n.


25 Grinshpan (1972). 
37.2 Gronwall: Area Inequalities 


In his 1914 paper, Gronwall derived results on the growth of a univalent function and 
its derivative.²⁸ These depended on the measure of the area of the image of a disk
under the conformal transformation given by the univalent function. Gronwall gave
two main applications of this idea. In the first, he assumed $f(x) = \sum_{n=1}^{\infty} a_n x^n$ to be
univalent and the area of the image of the unit disk under $f$ to be at most $A$. Then for
$|x| \le r < 1$, he showed that


$$|f(x)| \le \sqrt{\frac{A}{\pi} \log \frac{1}{1 - r^2}}.$$


He used the change of variables formula for an integral to conclude that the area
$A(r)$ of the image of the disk $|x| \le r < 1$ was


$$A(r) = \int_0^r d\rho \int_0^{2\pi} |f'(\rho e^{i\theta})|^2 \rho \, d\theta = \pi \sum_{n=1}^{\infty} n|a_n|^2 r^{2n}. \tag{37.18}$$

Note that

$$f'(\rho e^{i\theta}) = \sum_{n=1}^{\infty} n a_n \rho^{n-1} e^{i(n-1)\theta}$$


and term by term integration is possible because of absolute convergence. From
(37.18) Gronwall concluded, after letting $r$ tend to one, that


$$\pi \sum_{n=1}^{\infty} n|a_n|^2 \le A. \tag{37.19}$$

Using (37.19),


$$|f(x)| \le \sum_{n=1}^{\infty} |a_n| r^n,$$


and the Cauchy–Schwarz inequality, he found the required result:


$$|f(x)|^2 \le \sum_{n=1}^{\infty} n|a_n|^2 \cdot \sum_{n=1}^{\infty} \frac{r^{2n}}{n} \le \frac{A}{\pi} \log \frac{1}{1 - r^2}.$$


For the second application, Gronwall considered a function


$$f(x) = \frac{1}{x} + \sum_{n=0}^{\infty} a_n x^n,$$


28 Gronwall (1914). 
where, without the term $\frac{1}{x}$, the series converged for $|x| < 1$. Such series had been
discussed earlier, but Gronwall derived an important inequality for them, called the
area theorem. For this purpose, it is convenient to let $z = \frac{1}{x}$ and consider the class $\Sigma$
of functions


$$g(z) = z + b_0 + \frac{b_1}{z} + \frac{b_2}{z^2} + \cdots \tag{37.20}$$


analytic and one-to-one in $|z| > 1$ except for a simple pole at $\infty$ with residue 1. For
these functions, Gronwall proved that


$$\sum_{n=1}^{\infty} n|b_n|^2 \le 1. \tag{37.21}$$


To verify this, he again applied the area method. He did not give all the details,
but one may apply Green’s theorem to see that if the closed curve $C_r$ is the image of
$|z| = r > 1$ under $g(z)$, then it encloses a positive area


$$0 < \frac{1}{2} \int_0^{2\pi} \overline{g(re^{i\theta})}\, g'(re^{i\theta})\, re^{i\theta} \, d\theta = \pi \left\{ r^2 - \sum_{n=1}^{\infty} n|b_n|^2 r^{-2n} \right\}. \tag{37.22}$$


The necessary inequality follows by letting $r \to 1$.
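The area theorem (37.21) can be illustrated on explicit functions. The Python sketch below is an assumption-laden illustration rather than Gronwall's argument: it builds a member of the class $\Sigma$ as $g(z) = 1/f(1/z)$ from a normalized univalent $f$, which is one standard way to produce such functions, and sums $n|b_n|^2$ over the computed coefficients. The names `reciprocal_series`, `sigma_coeffs`, and `area_sum` are hypothetical.

```python
from fractions import Fraction

def reciprocal_series(p, terms):
    """r[0..terms-1] with (sum_k p[k] w^k) * (sum_k r[k] w^k) == 1, assuming p[0] == 1."""
    r = [Fraction(0)] * terms
    r[0] = Fraction(1)
    for k in range(1, terms):
        # The coefficient of w^k in the product must vanish.
        r[k] = -sum((p[j] * r[k - j] for j in range(1, k + 1)), Fraction(0))
    return r

def sigma_coeffs(p, terms):
    """b[0..terms-1] of g(z) = 1/f(1/z) = z + b_0 + b_1/z + ..., where f(w)/w = sum_k p[k] w^k.

    p must provide at least terms + 1 coefficients.
    """
    r = reciprocal_series(p, terms + 1)
    return r[1:]                      # b_n = r_{n+1}

def area_sum(b):
    """Partial sum of n*|b_n|^2 over n = 1..len(b)-1 (the left side of (37.21))."""
    return sum((n * abs(b[n]) ** 2 for n in range(1, len(b))), Fraction(0))

if __name__ == "__main__":
    koebe = [Fraction(k + 1) for k in range(10)]          # f(z) = z/(1-z)^2
    print(area_sum(sigma_coeffs(koebe, 8)))               # equality case
    poly = [Fraction(1), Fraction(1, 2)] + [Fraction(0)] * 8   # f(z) = z + z^2/2
    print(area_sum(sigma_coeffs(poly, 8)))                # strictly below 1
```

For the Koebe function, $g(z) = z - 2 + 1/z$, so the sum equals 1 and the complement of the image has zero area; for $f(z) = z + z^2/2$ the partial sums stay below $1/9$.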


37.3 Bieberbach’s Conjecture 


In 1916, apparently unaware of Gronwall’s earlier work, Bieberbach reproved the
area theorem and deduced his inequality for the second coefficient of functions in the
set $S$ of normalized univalent functions.²⁹ To achieve this, he used an idea he called
Faber’s trick: Supposing $f$ is a function in $S$, then $F(z) = (f(z^2))^{1/2}$ is an odd univalent
function. To prove this, observe that $f(z)$ vanishes only at $z = 0$ and hence a single
valued branch of the square root can be chosen in


$$F(z) = z(1 + a_2 z^2 + a_3 z^4 + \cdots)^{1/2}.$$


Clearly, $F(z)$ is odd. It is univalent because if $F(z_1) = F(z_2)$, then $f(z_1^2) = f(z_2^2)$;
moreover, the univalence of $f(z)$ implies $z_1 = \pm z_2$. If $z_1 = -z_2$, then
$F(z_1) = F(z_2) = -F(z_1)$. This implies $F(z_1) = 0$ or $z_1 = 0$, proving the result. To apply the
area theorem, Bieberbach noted that


$$F(z) = z + \frac{1}{2} a_2 z^3 + \cdots,$$


29 Bieberbach (1916). 
and used $F(z)$ to construct a function $g(z)$ in the class $\Sigma$:


$$g(z) = \frac{1}{F(1/z)} = z - \frac{1}{2} a_2 \frac{1}{z} + \cdots = z + \sum_{n=1}^{\infty} b_n z^{-n}.$$


Hence, by (37.21), he obtained $|b_1| \le 1$ or $|a_2| \le 2$. In a footnote, Bieberbach went
on to conjecture that $|a_n| \le n$ for the coefficients of a normalized univalent function
on the unit disk. He was able to verify Koebe’s conjecture that the image of the open
unit disk under any $f \in S$ would always contain a circle of radius $\frac14$ with the origin
as center. More precisely, he showed that if $f(z) \neq w$ in $|z| < 1$, then $|w| \ge \frac14$ (an
improvement on Koebe). This was an immediate corollary of the bound for $a_2$. For if
$f(z) \neq w$, then


$$\frac{w f(z)}{w - f(z)} = z + \left( a_2 + \frac{1}{w} \right) z^2 + \cdots \in S$$

and hence

$$\left| a_2 + \frac{1}{w} \right| \le 2, \quad \left| \frac{1}{w} \right| \le 2 + |a_2| \le 4 \quad \text{or} \quad |w| \ge \frac{1}{4}. \tag{37.23}$$

The example (often called Koebe’s function)

$$f(z) = \frac{z}{(1 - z)^2} = z + 2z^2 + 3z^3 + \cdots \tag{37.24}$$

or more generally

$$f_\theta(z) = \frac{z}{(1 - e^{i\theta} z)^2} = z + 2e^{i\theta} z^2 + 3e^{2i\theta} z^3 + \cdots \tag{37.25}$$

shows that $|a_2| = 2$ actually occurs for functions in $S$. We can also write (37.24) as


$$w = f(z) = \frac{1}{4} \left( \frac{1 + z}{1 - z} \right)^2 - \frac{1}{4}.$$


From this representation, it is easy to see that $f(z)$ maps $|z| < 1$ conformally onto
the $w$-plane cut from $-\frac14$ to $-\infty$ along the negative real axis. Note that $f(z) \neq -\frac14$
in $|z| < 1$. Moreover, because $f_\theta(z) = e^{-i\theta} f(ze^{i\theta})$, this function maps $|z| < 1$
conformally onto the $w$-plane cut radially from $-\frac14 e^{-i\theta}$ to $\infty$.

In this paper, Bieberbach obtained another important result on the growth of a
normalized univalent function $f(z)$:


$$\frac{r}{(1 + r)^2} \le |f(z)| \le \frac{r}{(1 - r)^2} \quad (|z| = r, \ 0 < r < 1). \tag{37.26}$$
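The growth estimate (37.26) is attained by Koebe’s function, which can be seen numerically. The Python sketch below is only an illustration and is not from the source: it samples $|f(z)|$ on circles $|z| = r$ for $f(z) = z/(1-z)^2$ and compares the values with the two bounds. The function names `koebe` and `growth_bounds` are hypothetical.

```python
import cmath
import math

def koebe(z):
    """Koebe's function f(z) = z / (1 - z)^2 for |z| < 1."""
    return z / (1 - z) ** 2

def growth_bounds(r):
    """Lower and upper bounds r/(1+r)^2 and r/(1-r)^2 from (37.26)."""
    return r / (1 + r) ** 2, r / (1 - r) ** 2

if __name__ == "__main__":
    for r in (0.1, 0.5, 0.9):
        lo, hi = growth_bounds(r)
        values = [abs(koebe(r * cmath.exp(1j * math.radians(d)))) for d in range(360)]
        # The extremes occur on the real axis: z = -r (minimum) and z = r (maximum).
        print(r, lo, min(values), max(values), hi)
```

As $r \to 1$ the lower bound tends to $\frac14$, in agreement with the one-quarter theorem (37.23).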
37.4 Littlewood: $|a_n| < en$ 


In 1923, Littlewood proved that Bieberbach’s conjecture was correct up to the order
of magnitude. His paper with the result given in the title of this section appeared in
1925.³⁰ Littlewood derived his inequality for the coefficients $a_n$ from the inequality,


$$\frac{1}{2\pi} \int_0^{2\pi} |f(re^{i\theta})| \, d\theta \le \frac{r}{1 - r}, \quad 0 < r < 1, \tag{37.27}$$


where $f \in S$. He considered the univalent function


$$\phi(z) = (f(z^2))^{1/2} = z + b_3 z^3 + \cdots$$


and by (37.26) concluded that


$$|\phi(te^{i\theta})| \le \frac{t}{1 - t^2}.$$


This result in turn implied that $\phi$ transformed $|z| \le t < 1$ to a region whose area
$A(t)$ was less than $\frac{\pi t^2}{(1 - t^2)^2}$. He combined this with the equation


$$\pi \sum_{n=1}^{\infty} n|b_n|^2 t^{2n} = \int_0^t r \, dr \int_0^{2\pi} |\phi'(re^{i\theta})|^2 \, d\theta = A(t)$$


to derive the inequality


$$\sum_{n=1}^{\infty} n|b_n|^2 t^{2n-1} \le \frac{t}{(1 - t^2)^2}.$$

Integrating from 0 to $r < 1$, he obtained


$$\sum_{n=1}^{\infty} |b_n|^2 r^{2n} \le \frac{r^2}{1 - r^2}.$$


He next observed that the series on the left-hand side of this inequality was
given by


$$\frac{1}{2\pi} \int_0^{2\pi} |\phi(re^{i\theta})|^2 \, d\theta = \frac{1}{2\pi} \int_0^{2\pi} |f(r^2 e^{2i\theta})| \, d\theta = \frac{1}{2\pi} \int_0^{2\pi} |f(r^2 e^{i\theta})| \, d\theta.$$


At this point, to derive the necessary result for $a_n$, Littlewood could apply Cauchy’s
formula, with $r = 1 - \frac{1}{n}$, to obtain


30 Littlewood (1925). 
$$|a_n| = \left| \frac{1}{2\pi i} \int_{|z| = r} \frac{f(z)}{z^{n+1}} \, dz \right| \le \frac{1}{2\pi r^n} \int_0^{2\pi} |f(re^{i\theta})| \, d\theta \le \frac{1}{r^{n-1}(1 - r)} = \left( 1 + \frac{1}{n-1} \right)^{n-1} n < en. \tag{37.28}$$


In the same paper, Littlewood also showed that if $M(r, f)$ denoted the maximum
of $|f(z)|$ on the circle $|z| = r$, and $f$ was univalent, then for $\lambda > \frac12$,


Lo 
xc f_lcre*)d0 = Arp* A py". (37.29) 
—H 
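Returning to the coefficient bound (37.28), the last step rests on the elementary fact that $(1 + \frac{1}{n-1})^{n-1} < e$. The short Python sketch below is an illustration, not part of Littlewood's paper: it evaluates the factor $1/(r^{n-1}(1-r))$ at $r = 1 - \frac{1}{n}$ and compares it with $en$. The function name `littlewood_factor` is hypothetical.

```python
import math

def littlewood_factor(n):
    """Value of (1 + 1/(n-1))**(n-1) * n, that is, 1/(r**(n-1) * (1-r)) at r = 1 - 1/n.

    Defined for integers n >= 2.
    """
    return (1 + 1 / (n - 1)) ** (n - 1) * n

if __name__ == "__main__":
    for n in (2, 5, 10, 100, 1000):
        # The ratio to n increases toward e but stays below it.
        print(n, littlewood_factor(n) / n, math.e)
```

The ratio of this factor to $n$ increases toward $e$ without reaching it, which is why the constant in Littlewood's bound is exactly $e$.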


37.5 Littlewood and Paley on Odd Univalent Functions 
Littlewood and Paley stated their main theorem of 1932:³¹ If

$$f(z) = z + a_3 z^3 + a_5 z^5 + \cdots$$


is an odd univalent function, then there is an absolute constant $A$, such that $|a_n| < A$.
To prove this result, they used Bieberbach’s growth theorem for univalent functions,
an inequality from Littlewood’s 1925 paper, and a new inequality given by


$$\frac{1}{2\pi} \int_{-\pi}^{\pi} |\sigma'(\rho e^{i\theta})|^2 \, d\theta \le C \rho^{-1} (1 - \rho)^{-1} M^2(\rho^{1/2}, \sigma), \tag{37.30}$$


where $C$ denoted a constant and $\sigma$ was in $S$, the set of normalized univalent functions.
In their proof of this, they assumed that $\sigma(z) = z + c_2 z^2 + c_3 z^3 + \cdots$ and applied
Gronwall’s formula to conclude that the area of the image of $|z| \le \rho$ under $\sigma$
satisfied


$$\pi \sum_{n=1}^{\infty} n|c_n|^2 \rho^{2n} \le \pi M^2(\rho, \sigma).$$

Using this they arrived at the required result:

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} |\sigma'(\rho e^{i\theta})|^2 \, d\theta = \sum_{n=1}^{\infty} n^2 |c_n|^2 \rho^{2n-2} \le \rho^{-2} \max_n (n\rho^n) \sum_{n=1}^{\infty} n|c_n|^2 \rho^n \le \frac{A}{\rho(1 - \rho)} M^2(\rho^{1/2}, \sigma),$$

for some absolute constant A. We note that in Littlewood and Paley’s paper, every 
absolute constant was denoted by the same symbol, A. We shall follow their 


convention. They constructed two other univalent functions related to $f(z)$, defined
by the relations

\[ \phi(z) = \left( f(z^{1/2}) \right)^2 = z + 2a_3 z^2 + \cdots, \]

\[ \psi(z) = \left( f(z^3) \right)^{1/3} = z + \frac{1}{3} a_3 z^7 + \cdots. \]

31 Littlewood and Paley (1932).

They proved the univalence of $\phi(z)$ by noting that $\phi(z) = w$ implied $f(\sqrt{z}) =
\pm\sqrt{w}$. They reasoned that since $f$ was odd, only a pair of equal and opposite values
were possible for $\sqrt{z}$, and hence only one value was possible for $z$. They also used a
simple argument to demonstrate $\psi(z)$ to be univalent.

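To prove their theorem, Littlewood and Paley applied Cauchy’s theorem to the
coefficients of $f'(z)$:

\[ |n a_n| \le \frac{\rho^{1-n}}{2\pi} \int_{-\pi}^{\pi} |f'(\rho e^{i\theta})| \, d\theta. \]

Thus, it was sufficient for them to show that

\[ \int_{-\pi}^{\pi} |f'(\rho e^{i\theta})| \, d\theta \le \frac{A}{1-\rho}. \tag{37.31} \]

They noted that by combining (37.31) with the inequality for $n a_n$, with $\rho = 1 - \frac{1}{n}$, they
obtained the required inequality for $a_n$. To prove (37.31), they observed that since
$f(z) = \psi^3(z^{1/3})$, it followed that for $z = \rho e^{i\theta}$,

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} |f'(z)| \, d\theta = \frac{1}{6\pi} \int_{-3\pi}^{3\pi} |f'(z)| \, d\theta = \frac{\rho^{-2/3}}{6\pi} \int_{-3\pi}^{3\pi} |\psi^2(z^{1/3}) \, \psi'(z^{1/3})| \, d\theta. \]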
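On applying the Cauchy–Schwarz inequality to the last integral, they found

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} |f'(z)| \, d\theta \le \rho^{-2/3} \left( \frac{1}{6\pi} \int_{-3\pi}^{3\pi} |\psi(z^{1/3})|^4 \, d\theta \right)^{1/2} \left( \frac{1}{6\pi} \int_{-3\pi}^{3\pi} |\psi'(z^{1/3})|^2 \, d\theta \right)^{1/2}. \tag{37.32} \]

Denoting the two integrals on the right by $P$ and $Q$, respectively, and applying
$|\psi(z^{1/3})|^4 = |\phi(z^2)|^{2/3}$ combined with the change of variables $t = 2\theta$, they estimated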


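\[ P = \frac{1}{12\pi} \int_{-6\pi}^{6\pi} |\phi(\rho^2 e^{it})|^{2/3} \, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} |\phi(\rho^2 e^{it})|^{2/3} \, dt \le A \rho^{4/3} (1 - \rho^2)^{-1/3}. \tag{37.33} \]

The last inequality followed from Littlewood’s inequality (37.29). To estimate $Q$, they
first used (37.30) to arrive at

\[ Q = \frac{1}{2\pi} \int_{-\pi}^{\pi} |\psi'(\rho^{1/3} e^{it})|^2 \, dt \le A \rho^{-1/3} (1 - \rho^{1/3})^{-1} M^2(\rho^{1/6}, \psi). \tag{37.34} \]

An application of the growth estimate to $M^2(\rho^{1/6}, \psi)$ would not produce the
necessary result, so they used $\psi^3(z^{1/3}) = \phi^{1/2}(z^2)$ to get

\[ M^2(\rho^{1/6}, \psi) = M^{1/3}(\rho, \phi) \le \rho^{1/3} (1 - \rho)^{-2/3}. \]

Combining this with (37.34), they obtained

\[ Q \le A \rho^{-1/3} (1 - \rho^{1/3})^{-1} \rho^{1/3} (1 - \rho)^{-2/3} \le A (1 - \rho)^{-5/3}. \]

Taking this inequality with (37.33) and (37.32) gave the required result:

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} |f'(\rho e^{i\theta})| \, d\theta \le A \rho^{-2/3} \rho^{2/3} (1 - \rho^2)^{-1/6} (1 - \rho)^{-5/6} \le A (1 - \rho)^{-1}. \]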
This completed their ingenious proof that the coefficients of odd univalent functions 
were bounded. 


37.6 Karl Löwner and the Parametric Method


Carathéodory’s theorem was used a decade later by Löwner to construct his parametric
theory of univalent functions. To describe Löwner’s method, we must first define slit
mappings. A single-slit mapping is a function mapping a domain conformally onto
the complex plane minus a single Jordan arc. Löwner showed³² that such mappings
were dense in $S$, the set of all conformal mappings of the unit disk with $f(0) = 0$
and $f'(0) = 1$. More exactly, for each $f \in S$, there exists a sequence of single-slit
mappings $f_n \in S$ such that $f_n \to f$ uniformly on compact subsets of the unit disk.
We follow Duren’s presentation to summarize the argument from Löwner’s 1923
paper.³³ It is sufficient to consider functions $f$ mapping the unit disk onto a domain
bounded by a (closed) analytic Jordan curve because, for any $f \in S$, the function
$f(rz)$, $0 < r < 1$, is also univalent with the required image and $f_r(z) = \frac{1}{r} f(rz) \in S$.
By letting $r \to 1$, we get functions $f_r \in S$ such that $f_r \to f$ uniformly on compact
subsets of the unit disk. So assume that $f \in S$ maps the unit disk onto a domain $G$
bounded by an analytic Jordan curve $C$. Choose a point $w_0 \in C$ and let $\Gamma$ be any
Jordan arc from $\infty$ to $w_0$. Denote by $\Gamma_n$ the Jordan arc consisting of $\Gamma$ followed
by a part of $C$ joining $w_0$ to a point $w_n \in C$. Let $G_n$ represent the complement of
$\Gamma_n$ in the complex plane, and let $g_n$ map the unit disk onto $G_n$, with $g_n(0) = 0$ and
$g_n'(0) > 0$. We note that such a function $g_n$ exists, by the Riemann mapping theorem.
Now choose a sequence of points $w_n \in C$ such that $w_n \to w_0$ and $\Gamma_n \subset \Gamma_{n+1}$.
Then $G$ is the kernel of the sequence $\{G_n\}$. By Carathéodory’s kernel convergence
theorem, we must have $g_n \to f$ uniformly on compact subsets of the unit disk and, by
Cauchy’s theorem, $g_n'(0) \to f'(0) = 1$. Hence, $f_n = g_n / g_n'(0)$ is a sequence of single-slit
mappings converging to $f$ uniformly on compact subsets of the unit disk. Thus, we
conclude with Löwner that the single-slit mappings are dense in $S$.

32 Loewner (1988) pp. 45-63.
33 Duren (1983).

Now suppose $f \in S$ is a single-slit mapping taking the unit disk onto a domain
$G$, the complement of a Jordan arc $\Gamma$ extending from a point $w_0$ in the complex
plane to $\infty$. Also suppose that $w = w(t)$, $0 \le t < T$, is a continuous, one-to-one
parametrization of $\Gamma$ with $w(0) = w_0$. Let $\Gamma_t$ denote that part of $\Gamma$ from $w(t)$ to
$\infty$, and let $G_t$ represent the complement of $\Gamma_t$. Let $g_t(z) = g(z,t)$ be the conformal
mapping of the unit disc onto $G_t$, with $g(0,t) = 0$ and $g'(0,t) = \gamma(t) > 0$, so that
$g(z,t)$ has the series expansion

\[ g_t(z) = g(z,t) = \gamma(t) \left\{ z + \alpha_2(t) z^2 + \alpha_3(t) z^3 + \cdots \right\}, \tag{37.35} \]

where $g(z,0) = f(z)$. By an application of the Schwarz lemma, $\gamma(t)$ may be seen to
be a monotonically increasing function of $t$. Thus, by reparametrization, we can take
$\gamma(t) = e^t$. Moreover, $T$ will then be $\infty$. So we can write

\[ g_t(z) = g(z,t) = e^t \left\{ z + \sum_{n=2}^{\infty} \alpha_n(t) z^n \right\}, \quad 0 \le t < \infty, \tag{37.36} \]

in what is called the standard parametrization. Löwner then considered the family of
mappings

\[ f_t(z) = g_t^{-1}(f(z)) = e^{-t} \left[ z + \sum_{n=2}^{\infty} a_n(t) z^n \right], \quad 0 \le t < \infty. \tag{37.37} \]


It is easy to see that the functions $f_t$ map the unit disk onto the unit disk minus
an arc extending inward from the boundary, and that $e^t f_t \in S$. By using the growth
estimates of Bieberbach and Gronwall, he was able to conclude that

\[ \lim_{t \to \infty} e^t f_t(z) = f(z). \tag{37.38} \]

And it is obvious that $f_0(z) = z$, the identity function. So the function $e^t f_t(z)$ starts
at the identity function $f_0(z) = z$ and ends at $f(z) \in S$ as $t \to \infty$. Löwner determined
the differential equation satisfied by this one parameter family of functions,

\[ \frac{\partial f_t}{\partial t} = -f_t \, \frac{1 + \kappa(t) f_t}{1 - \kappa(t) f_t}, \tag{37.39} \]

where $\kappa(t)$ was a continuous complex valued function with $|\kappa(t)| = 1$, $0 \le t < \infty$.
He also gave the equation satisfied by the family of functions $g_t(z)$. By (37.37),
$g_t(f_t(z)) = f(z)$. Setting $\zeta = f_t(z)$, we have $g_t(\zeta) = f(z)$; take the derivative with
respect to $t$ to get

\[ \frac{\partial g_t}{\partial t}(\zeta) + \frac{\partial g_t}{\partial \zeta}(\zeta) \, \frac{\partial f_t}{\partial t} = 0. \]

When this is combined with (37.39), the result is the differential equation for $g_t(z)$:

\[ \frac{\partial g_t}{\partial t} = z \, \frac{\partial g_t}{\partial z} \, \frac{1 + \kappa(t) z}{1 - \kappa(t) z}, \quad 0 \le t < \infty, \tag{37.40} \]


where go(z) = f(z) and lim g;(z) = z. 

Löwner applied his parametric method to the Bieberbach conjecture. In his paper
he deduced only that $|a_2| \le 2$ and $|a_3| \le 3$.³⁴ Bieberbach suggested that Löwner
call his paper part I of a work in progress, since it was clear that he had a general
method applicable to the coefficient problem. To understand Löwner’s derivation
of the inequalities for the second and third coefficients, note that since the class
of univalent functions $S$ is invariant under rotation, it is sufficient to prove that
$\operatorname{Re}(a_3) \le 3$. Now substitute the series (37.37) for $f_t$ into the differential equation
(37.39) and equate the coefficients of $z^2$ and $z^3$ on both sides to get the two relations

\[ a_2'(t) = -2e^{-t} \kappa(t) \tag{37.41} \]

and

\[ a_3'(t) = -2e^{-2t} [\kappa(t)]^2 - 4e^{-t} \kappa(t) a_2(t). \tag{37.42} \]


Since $a_2(0) = 0$, and $\lim_{t \to \infty} a_n(t) = a_n$, where $a_n$ is the $n$th Taylor coefficient of the
univalent function $f \in S$, we may integrate equation (37.41) to get

\[ a_2 = \int_0^{\infty} a_2'(t) \, dt = -2 \int_0^{\infty} e^{-t} \kappa(t) \, dt; \tag{37.43} \]

hence

\[ |a_2| \le 2 \int_0^{\infty} e^{-t} \, dt = 2 \quad \text{because } |\kappa(t)| = 1. \]

Substituting (37.41) into (37.42) then produces

\[ a_3'(t) = 2 a_2(t) a_2'(t) - 2e^{-2t} [\kappa(t)]^2; \tag{37.44} \]

integrate to obtain

\[ a_3 = 4 \left( \int_0^{\infty} \kappa(t) e^{-t} \, dt \right)^2 - 2 \int_0^{\infty} \kappa(t)^2 e^{-2t} \, dt. \]

We next set $\kappa(t) = e^{i\theta(t)}$ to get

\[ \operatorname{Re}(a_3) = 4 \left[ \left( \int_0^{\infty} \cos\theta(t) \, e^{-t} \, dt \right)^2 - \left( \int_0^{\infty} \sin\theta(t) \, e^{-t} \, dt \right)^2 - \int_0^{\infty} \cos^2\theta(t) \, e^{-2t} \, dt \right] + 1. \]

34 Loewner (1988) pp. 62-63.

By the Cauchy–Schwarz inequality

\[ \left( \int_0^{\infty} \cos\theta(t) \, e^{-t} \, dt \right)^2 \le \int_0^{\infty} \cos^2\theta(t) \, e^{-t} \, dt \int_0^{\infty} e^{-t} \, dt = \int_0^{\infty} \cos^2\theta(t) \, e^{-t} \, dt, \]

so that

\[ \operatorname{Re}(a_3) \le 4 \int_0^{\infty} \cos^2\theta(t) \left( e^{-t} - e^{-2t} \right) dt + 1 \le 4 \int_0^{\infty} \left( e^{-t} - e^{-2t} \right) dt + 1 = 3. \]

Löwner also wrote down expressions for $a_n'(t)$ and then, after integration, the
expressions for $a_n(t)$. However, these are generally too complex to be conveniently
utilized.
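The integral formulas for $a_2$ and $a_3$ can be tried out numerically. The following Python sketch is an illustration under stated assumptions, not Löwner's procedure: it takes a user-supplied real function $\theta(t)$, sets $\kappa(t) = e^{i\theta(t)}$, and approximates the integrals in (37.43) and in the integrated form of (37.44) after the substitution $u = e^{-t}$. The function name `loewner_a2_a3`, the midpoint rule, and the step count are choices made here for convenience.

```python
import cmath
import math

def loewner_a2_a3(theta, steps=20000):
    """Approximate a_2 and a_3 for the driving function kappa(t) = exp(i * theta(t)).

    With u = exp(-t), the integrals over 0 <= t < infinity become
        integral of kappa(t) e^{-t} dt     = integral over (0, 1] of kappa du,
        integral of kappa(t)^2 e^{-2t} dt  = integral over (0, 1] of kappa^2 * u du.
    The midpoint rule avoids the endpoint u = 0, where t is infinite.
    """
    int_kappa = 0j
    int_kappa_sq = 0j
    h = 1.0 / steps
    for j in range(steps):
        u = (j + 0.5) * h
        kappa = cmath.exp(1j * theta(-math.log(u)))
        int_kappa += kappa * h
        int_kappa_sq += kappa * kappa * u * h
    a2 = -2 * int_kappa                          # (37.43)
    a3 = 4 * int_kappa ** 2 - 2 * int_kappa_sq   # integrated form of (37.44)
    return a2, a3

# Constant driving function kappa = -1: the extremal values 2 and 3 appear.
print(loewner_a2_a3(lambda t: math.pi))
# A rotating driving function kappa = e^{it} stays well inside the bounds.
print(loewner_a2_a3(lambda t: t))
```

The constant choice $\kappa \equiv -1$ gives $a_2 = 2$ and $a_3 = 3$, the coefficients of the Koebe function, so the inequalities $|a_2| \le 2$ and $\operatorname{Re}(a_3) \le 3$ cannot be improved.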


37.7 De Branges: Proof of Bieberbach 


In proving the Bieberbach conjecture, by first proving the Milin conjecture, de Branges
applied Löwner’s theory of 1923.³⁵ Though it may appear that Löwner’s theory could
have been applied at any time, it was not until Milin’s contribution that there was
a route connecting Löwner to Bieberbach; recall that Milin stated his conjecture for
logarithmic coefficients only in 1971. De Branges’s great insight was to use special 
functions to prove Milin’s conjecture. And it took the boldness of an independent 
thinker such as de Branges to make such an attempt. As we have mentioned, de 
Branges was a functional analyst. He had developed the theory of square summable 
power series within that context and wished to apply it to various problems, including 
the Bieberbach conjecture. His extensive functional analytic machinery, so useful to 
his insights and manner of thought, proved to be a roadblock for others attempting 
to understand his proof. In the spring of 1984, de Branges presented his proof to 
the members of the Leningrad (St. Petersburg) geometric function theory seminar. 
The members of the seminar generously expended a good deal of effort to help 
him simplify it and express it in classical form. This version of the proof was 
soon written up by Milin and published as a preprint by the Steklov Institute in 
Leningrad. FitzGerald and Pommerenke used this preprint to obtain further technical 
simplifications, also independently found by de Branges. It is the simplified form of 
the proof that we shall discuss here. 

Consider the logarithmic coefficients of the function $g_t(z)$ defined by equation
(37.36). Thus,

\[ \log\left( \frac{g_t(z)}{e^t z} \right) = \sum_{k=1}^{\infty} c_k(t) z^k, \quad |z| < 1, \tag{37.45} \]

with $0 \le t < \infty$, and $c_k(0) = 2\gamma_k$, where $2\gamma_k$ are the logarithmic coefficients of the
function $f \in S$. Here recall that $g(z,t) = g_t(z)$. If equation (37.45) is differentiated
with respect to $t$ and then $z$ and the results are substituted in (37.40) and simplified,
we get

\[ 1 + \sum_{n=1}^{\infty} c_n'(t) z^n = \left( 1 + 2\kappa(t) z + 2\kappa(t)^2 z^2 + \cdots \right) \left( 1 + \sum_{n=1}^{\infty} n c_n(t) z^n \right). \tag{37.46} \]

35 de Branges (1985).


Equate the coefficients of $z^n$ on both sides to get the differential equations satisfied
by the coefficients $c_n(t)$:

\[ c_n'(t) = 2\kappa(t)^n + n c_n(t) + 2 \sum_{m=1}^{n-1} \kappa(t)^{n-m} m c_m(t), \quad n = 1, 2, \ldots. \tag{37.47} \]


Note that this is a differential equation for logarithmic coefficients; recall that
Löwner had obtained similar differential equations for the coefficients of $g_t(z)$,
although they quickly became unwieldy when one tried to solve for them inductively.
To prove Milin’s conjecture, de Branges made effective use of these differential
equations, by introducing some special functions. Recall the Milin conjecture:

\[ \sum_{m=1}^{n} \left( m |c_m(0)|^2 - \frac{4}{m} \right) (n - m + 1) \le 0, \quad n = 1, 2, 3, \ldots. \]

De Branges defined a function

\[ \phi(t) = \sum_{m=1}^{n} \left( m |c_m(t)|^2 - \frac{4}{m} \right) \tau_{n,m}(t), \tag{37.48} \]

where certain properties were required of $\tau_{n,m}(t)$, including

\[ \tau_{n,m}(0) = n - m + 1, \quad m = 1, 2, \ldots, n. \tag{37.49} \]

Now, if we can choose $\tau_{n,m}(t)$ such that $\phi'(t) \ge 0$ and $\phi(\infty) = 0$, then we would
automatically have $\phi(0) \le 0$, Milin’s conjecture. To compute $\phi'(t)$, first set

\[ b_0(t) = 0, \quad b_m(t) = \sum_{k=1}^{m} k c_k(t) \kappa(t)^{-k}, \quad m = 1, 2, \ldots, n, \tag{37.50} \]

and set

\[ \tau_{n,n+1}(t) = 0, \quad \text{for } 0 \le t < \infty, \tag{37.51} \]

so that, by a straightforward calculation,

\[ \phi'(t) = \sum_{m=1}^{n} \left( |b_m - b_{m-1}|^2 - 4 \right) \frac{\tau_{n,m}'}{m} + \sum_{m=1}^{n} \left( 2|b_m|^2 + 4\operatorname{Re}(b_m) \right) \left( \tau_{n,m} - \tau_{n,m+1} \right). \tag{37.52} \]

This expression for $\phi'(t)$ takes a very simple form if the functions $\tau_{n,m}(t)$ satisfy
the difference differential equation

\[ \tau_{n,m} - \tau_{n,m+1} = -\frac{\tau_{n,m}'}{m} - \frac{\tau_{n,m+1}'}{m+1}. \tag{37.53} \]

In that case,

\[ \phi'(t) = -\sum_{m=1}^{n} |b_{m-1} + b_m + 2|^2 \, \frac{\tau_{n,m}'}{m}, \tag{37.54} \]

and Milin’s conjecture is proved, provided that de Branges’s system of functions $\tau_{n,m}$
satisfying (37.49), (37.51), and (37.53) also satisfy

\[ \tau_{n,m}'(t) \le 0, \quad 0 \le t < \infty, \quad m = 1, 2, \ldots, n. \tag{37.55} \]


Thus, from these equations we must determine the form of the functions $\tau_{n,m}(t)$.
We may solve successively for $\tau_{n,n}$, $\tau_{n,n-1}$, and so on. So by (37.49), (37.51), and
(37.53), we may write

\[ \tau_{n,n} = -\frac{\tau_{n,n}'}{n} \quad \text{or} \quad \tau_{n,n}(t) = Ae^{-nt} = e^{-nt}, \]

because $\tau_{n,n}(0) = 1$. Next we solve

\[ \tau_{n,n-1} + \frac{\tau_{n,n-1}'}{n-1} = 2e^{-nt} \]

to get

\[ \tau_{n,n-1}(t) = -2(n-1) e^{-nt} + 2n e^{-(n-1)t}. \]

Note that in general we obtain

\[ \tau_{n,m}(t) = m \sum_{k=0}^{n-m} (-1)^k \, \frac{(2m+k+1)_k \, (2m+2k+2)_{n-m-k}}{(m+k) \, k! \, (n-m-k)!} \, e^{-(m+k)t} \tag{37.56} \]

when $m = 1, 2, \ldots, n$, since $\tau_{n,n+1} = 0$.

Now recall that the truth of the $n$th Milin inequality implies the truth of the
Bieberbach conjecture for the $(n+1)$th coefficient. Thus, to show that $|a_2| \le 2$, it is
enough to check that $\tau_{1,1}'(t) = -e^{-t}$ is negative, and this fact is obvious. For the third
coefficient, we need to check the derivatives of two polynomials in $e^{-t}$:

\[ \tau_{2,1}(t) = -2e^{-2t} + 4e^{-t} \quad \text{and} \quad \tau_{2,2}(t) = e^{-2t}. \]

Observe that their derivatives are

\[ 4e^{-t}(e^{-t} - 1) \le 0 \quad \text{and} \quad -2e^{-2t} < 0 \quad (\text{for } 0 \le t < \infty), \]

respectively. Hence, $|a_3| \le 3$. In this manner, de Branges verified the Bieberbach
conjecture up to the sixth coefficient, although the computations for the last two cases 
were complicated. 
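The hand computations for small $n$ can be mirrored by machine. The Python sketch below is only an illustration: it builds the coefficients of $\tau_{n,m}$ from the closed form (37.56), written with shifted factorials $(a)_k = a(a+1)\cdots(a+k-1)$, in exact rational arithmetic, and then tests the initial values (37.49), the system (37.53), and the sign condition (37.55) at finitely many rational values of $u = e^{-t}$. All function names are hypothetical, and a finite sample of points is evidence rather than a proof of (37.55).

```python
from fractions import Fraction
from math import factorial

def pochhammer(a, k):
    """Shifted factorial (a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    result = 1
    for j in range(k):
        result *= a + j
    return result

def tau(n, m):
    """Coefficients {j: c_j} of tau_{n,m}(t) = sum of c_j e^{-jt}, following (37.56)."""
    if m > n:
        return {}  # tau_{n,n+1} = 0, as in (37.51)
    coeffs = {}
    for k in range(n - m + 1):
        num = (-1) ** k * pochhammer(2 * m + k + 1, k) * pochhammer(2 * m + 2 * k + 2, n - m - k)
        den = (m + k) * factorial(k) * factorial(n - m - k)
        coeffs[m + k] = Fraction(m * num, den)
    return coeffs

def derivative(coeffs):
    """The t-derivative of sum c_j e^{-jt} is sum (-j c_j) e^{-jt}."""
    return {j: -j * c for j, c in coeffs.items()}

def evaluate(coeffs, u):
    """Value of sum c_j u^j, where u stands for e^{-t}."""
    return sum(c * u ** j for j, c in coeffs.items())

def system_residual(n, m):
    """Coefficients of tau_m - tau_{m+1} + tau_m'/m + tau_{m+1}'/(m+1).

    Every coefficient is zero exactly when (37.53) holds for this pair (n, m).
    """
    residual = {}
    terms = [
        (tau(n, m), Fraction(1)),
        (tau(n, m + 1), Fraction(-1)),
        (derivative(tau(n, m)), Fraction(1, m)),
        (derivative(tau(n, m + 1)), Fraction(1, m + 1)),
    ]
    for coeffs, scale in terms:
        for j, c in coeffs.items():
            residual[j] = residual.get(j, 0) + scale * c
    return residual

for n in range(1, 5):
    for m in range(1, n + 1):
        print(n, m, tau(n, m), evaluate(tau(n, m), 1))
```

Here `evaluate(tau(n, m), 1)` is the value at $t = 0$, which should equal $n - m + 1$ by (37.49).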

At this point in early February 1984, the stage was set for de Branges to request his 
colleague at Purdue, Walter Gautschi, a numerical analyst with an interest in special 
functions, to check the calculations by computer. Gautschi was swamped with work at 
the time, but he was unable to resist the challenge; he attended de Branges’s seminar 
and reported,³⁶ “I was immediately struck by the clarity, freshness, and elegance of
Louis’s talk and began to appreciate how those inequalities came about. To my delight,
they could be written in terms of orthogonal polynomials – currently a subject very
much on my mind.” Gautschi developed the necessary algorithms and managed to
verify the Milin conjecture up to $n = 30$. Wondering if the inequalities could be proved
analytically, he consulted Richard Askey. It turned out Askey and George Gasper had
proved a slightly more general inequality less than a decade earlier.³⁷


As it turned out, $\tau_{n,m}'(t)$ could be expressed as a sum of Jacobi polynomials:

\[ \tau_{n,m}'(t) = -m e^{-mt} \sum_{k=0}^{n-m} P_k^{(2m,0)}\left( 1 - 2e^{-t} \right). \tag{37.57} \]

In a 1976 paper, Askey and Gasper had proved³⁸ that for any real $\alpha > -1$,

\[ \sum_{k=0}^{n} P_k^{(\alpha,0)}(x) \ge 0, \quad -1 \le x \le 1. \tag{37.58} \]

This immediately implied de Branges’s inequalities: $\tau_{n,m}'(t) \le 0$ for $0 \le t < \infty$.
Askey and Gasper’s investigation of these sums of Jacobi polynomials arose out of 
their study of several classical inequalities for trigonometric functions. Their insight 
was that the correct generalization for these classical inequalities was in the context 
of Jacobi polynomials, within which the powerful machinery of hypergeometric 
functions could be applied. 
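To make the link between (37.57) and (37.58) concrete, the following Python sketch evaluates the Jacobi sums exactly for small parameters. It is an illustration only: it uses the standard finite-sum representation of $P_k^{(\alpha,0)}(x)$ restricted to nonnegative integer $\alpha$ (the case needed for $\alpha = 2m$), the function names are hypothetical, and the sign test covers only a grid of rational points, so it checks instances of the Askey–Gasper inequality rather than proving it.

```python
from fractions import Fraction
from math import comb

def jacobi_alpha_zero(k, alpha, x):
    """P_k^{(alpha,0)}(x) for an integer alpha >= 0, from the finite sum
    sum over s of C(k+alpha, k-s) C(k, s) ((x-1)/2)^s ((x+1)/2)^(k-s)."""
    return sum(
        comb(k + alpha, k - s) * comb(k, s)
        * ((x - 1) / 2) ** s * ((x + 1) / 2) ** (k - s)
        for s in range(k + 1)
    )

def askey_gasper_sum(n, alpha, x):
    """Left-hand side of (37.58): the sum of P_k^{(alpha,0)}(x) for k = 0, ..., n."""
    return sum(jacobi_alpha_zero(k, alpha, x) for k in range(n + 1))

def tau_prime(n, m, u):
    """Right-hand side of (37.57), written in the variable u = e^{-t}."""
    x = 1 - 2 * u
    return -m * u ** m * askey_gasper_sum(n - m, 2 * m, x)

u = Fraction(1, 3)
# Compare with the derivative of tau_{2,1}(t) = -2e^{-2t} + 4e^{-t}, namely 4u^2 - 4u.
print(tau_prime(2, 1, u), 4 * u ** 2 - 4 * u)
```

Passing `Fraction` values for `x` and `u` keeps every step exact, so the comparison with the derivative of $\tau_{2,1}$ is an equality test rather than a floating-point approximation.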


36 Gautschi (1986).
37 Askey and Gasper (1976).
38 ibid.

One proof of the Askey–Gasper inequality employed a theorem of Clausen on the
square of a ${}_2F_1$ hypergeometric function and also a connection coefficient result of
Gegenbauer. Note that we have stated these results in the exercises of Chapters 23
and 24. This second result is also known as the Gegenbauer–Hua formula. Askey has
pointed out³⁹ that Gegenbauer’s formula having been forgotten, Hua rediscovered it⁴⁰
in the course of his work in harmonic analysis, carried out in the 1940s and 1950s.
Note that Askey also rediscovered this formula in the 1960s. Hua Loo-Keng (1910–1985)
taught himself mathematics, and by the age of 19, he was writing papers;
these came to the notice of a professor at Qinghua University in Beijing. Hua was 
consequently appointed to a position at that university, and his career was launched. 
In 1936, he traveled to Cambridge to work with Hardy, Littlewood, and Davenport 
on problems in additive number theory. Later, he did research in several complex 
variables, automorphic functions, and group theory. His wide interests helped him 
lead the development of modern mathematics in China; in fact, in the 1960s, he turned 
his attention to mathematical problems with immediate practical applicability. Hua’s 
student, Chen Jing-Run (1933-1996) made important contributions to the Goldbach 
conjecture. 


37.8 Exercises 
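(1) Show that if $0 \le t < 1$ and $\alpha > -2$, then

\[ {}_3F_2\left( \begin{matrix} -n, \; n+\alpha+2, \; \frac{\alpha+1}{2} \\ \alpha+1, \; \frac{\alpha+3}{2} \end{matrix} ; \, t \right) > 0. \]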


See the article by Askey and Gasper in Baernstein (1986). 
(2) Show that if

\[ |a_1| \ge \sum_{i=2}^{\infty} |a_i|, \quad \text{then the function} \quad \sum_{n=1}^{\infty} \frac{a_n z^n}{n} \]

maps the interior of the unit circle upon a star-shaped region with center at the
origin.

(3) Show that, with the same condition on the coefficients as in Exercise 2, the
function $\sum_{n=1}^{\infty} a_n z^n / n^2$ maps the interior of the unit circle upon a convex region.
See Alexander (1915) for this and for Exercise 2.

(4) Prove that if $f$ is an analytic mapping of the unit disk into itself and if $z, z_1, z_2$
are in the unit disk, then

\[ \frac{|f(z_1) - f(z_2)|}{|1 - \overline{f(z_1)} f(z_2)|} \le \frac{|z_1 - z_2|}{|1 - \overline{z_1} z_2|}, \]

\[ \frac{|f'(z)|}{1 - |f(z)|^2} \le \frac{1}{1 - |z|^2}. \]

Since the Poincaré metric $ds = \frac{|dz|}{1 - |z|^2}$ defines a noneuclidean length element
(infinitesimal) in the unit disk, these inequalities may be interpreted to mean
that the analytic mapping $f$ decreases the noneuclidean distance between
two points and the noneuclidean length of an arc. This invariant form of
Schwarz’s lemma is due to Georg Pick; see Pick (1915). Georg Pick (1859–1942),
professor at Prague for forty-five years, was Königsberger’s student
and Löwner’s teacher; as mentioned earlier, he died in the Theresienstadt
concentration camp.

39 Schoenberg (1988) vol. 1, p. 192.
40 Hua (1981) pp. 38-39.

(5) This exercise and the next mention results in geometric function theory, taking
a direction different from that discussed in the text. If $\Gamma$ is a finitely generated
Kleinian group with region of discontinuity $\Omega$, then $\Omega/\Gamma$ is of finite type. Read
the proof of this theorem in Ahlfors (1982) vol. 2, pp. 273-290. Note that this
proof has a gap, later filled by Lipman Bers and Ahlfors himself. The theory
of Kleinian groups was initiated by Poincaré. A nice history of this topic is
given by Gray (1986). Poincaré (1985), translated into English by Stillwell,
offers many important papers of Poincaré in this area and also provides a
useful introduction by putting the papers into historical perspective and relating
their results to some modern work. For Poincaré’s pioneering work on topology,
the reader may enjoy the article by Karanbir S. Sarkaria in James (1999)
pp. 123-167.

(6) If a Kleinian group is generated by $N$ elements, then Area$(\Omega/\Gamma) \le 4\pi(N-1)$
and $\Omega/\Gamma$ has at most $84(N-1)$ components. Bers proved this theorem in 1967;
read a proof in Bers (1998) vol. 1, pp. 459-477.


37.9 Notes on the Literature 


Gray (1994) presents a history of the Riemann mapping theorem. Duren (1983) is
an extremely readable and clear exposition and explanation of results on univalent 
functions; it was written just before the Bieberbach conjecture was proved. The books 
of Hayman (1994) and Gong (1999) give proofs of the conjecture. Several articles in 
Baernstein et al. (1986) and the paper by Fomenko and Kuzmina (1986) deal with 
the historical aspects of the Bieberbach conjecture and proof. FitzGerald (1985) and 
Pommerenke (1985) contain the reactions to the proof by two experts in univalent 
function theory.

38.1 Preliminary Remarks


Finite fields are of fundamental importance in pure as well as applied mathematics. 
Applications to coding, combinatorial design, and switching circuits have been made 
since the mid-1900s. Gauss himself first conceived of the theory of finite fields 
between 1796 and 1800, although he published only some of his work in this area, 
so that it did not exert the influence it might have. Gauss’s work arose in the context 
of divisibility problems in number theory. The origins of these questions may, in turn, 
be traced to the work of Fermat who pursued the topic in the course of tackling a 
problem on perfect numbers posed to him by the amateur mathematician Frénicle de 
Bessy, through Mersenne.¹ This question boiled down to showing that $2^{37} - 1$ was
not prime. In a letter to Frénicle dated October 18, 1640, Fermat wrote that a prime
$p$ would divide $a^n - 1$ for some $n$ dividing $p - 1$; this is now called Fermat’s little
theorem. Moreover, if $N = nm$, where $n$ was the smallest such number, then $p$ would
also divide $a^N - 1$. The second part of the result is easy to understand by observing that

\[ a^N - 1 = a^{nm} - 1 = (a^n - 1) \left( a^{n(m-1)} + a^{n(m-2)} + \cdots + a^n + 1 \right). \]


Fermat intended to write a treatise on his number theoretic work, but never did so. 

Because Fermat failed to publish his proofs, Euler had to rediscover them; this 
effort, like many of his projects, stretched over decades. He investigated the structure 
of the set of integers modulo a prime p, among many other questions. He conjectured 
but did not completely prove that there existed an integer $a$ such that $a, a^2, \ldots, a^{p-1}$
modulo $p$ produced all the integers $1, 2, \ldots, p - 1$, though not in that order. Such an
integer $a$ is called a primitive root of the equation $x^{p-1} = 1 \pmod{p}$, or in Gauss’s
notation $x^{p-1} \equiv 1 \pmod{p}$.
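The definition of a primitive root is easy to test by brute force for small primes. The following Python sketch is an illustration only: it lists the residues of $a, a^2, \ldots, a^{p-1}$ modulo $p$ and checks whether they exhaust $1, 2, \ldots, p-1$. The function names are hypothetical, and the exhaustive search is practical only for small $p$.

```python
def powers_mod(a, p):
    """Residues of a, a^2, ..., a^(p-1) modulo p, in that order."""
    residues = []
    value = 1
    for _ in range(p - 1):
        value = (value * a) % p
        residues.append(value)
    return residues

def is_primitive_root(a, p):
    """True if the powers of a run through every residue 1, ..., p-1."""
    return sorted(powers_mod(a, p)) == list(range(1, p))

def smallest_primitive_root(p):
    """Smallest primitive root of the prime p, found by exhaustive search."""
    for a in range(1, p):
        if is_primitive_root(a, p):
            return a
    return None

print(powers_mod(3, 7))            # the residues appear, though not in increasing order
print(smallest_primitive_root(7))
```

For $p = 7$ the powers of 3 give the residues 3, 2, 6, 4, 5, 1, while the powers of 2 give only 2, 4, 1 repeated, so 3 is a primitive root and 2 is not.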

In modern terminology, letting $\mathbb{Z}_p$ denote the integers modulo $p$, and setting
$\mathbb{Z}_p^{\times} = \mathbb{Z}_p - \{0\}$, then Euler’s conjecture was that $\mathbb{Z}_p^{\times}$ would be cyclic. In fact,
Euler thought that he had proved this proposition; he presented his efforts to the
St. Petersburg Academy in 1772.² However, Gauss pointed out in Article 56 of his

1 For a discussion of Fermat’s work on number theory and references for the little theorem, see Weil (1983); for
perfect numbers, see pp. 53-56.
Disquisitiones Arithmeticae that Euler’s proof was incomplete; Gauss proved the 
theorem in full in the 1790s and published his proof in his Disquisitiones. Euler 
worked with Z[x], the ring of polynomials with coefficients in Z, as did Lagrange. 
In 1768, Lagrange proved the basic theorem that any such polynomial of degree m 
would have at most m roots modulo p. 

Gauss was able to delve deeply into the theory of the ring of polynomials over 
finite fields when he perceived that the number theoretic properties of this ring were 
analogous to those of the ring of integers; the irreducible polynomials here played 
the role of the prime numbers. Gauss proved the fundamental theorem that every 
irreducible polynomial $P(x) \ne x$ of degree $m$ in $\mathbb{Z}_p[x]$ must divide $x^{p^m - 1} - 1$
in $\mathbb{Z}_p[x]$. Gauss also gave a formula for the number of irreducible polynomials of
degree $m$. In his derivation, he applied Möbius inversion without an explicit statement
of the general formula. In 1832, August Möbius (1790–1868), a student of Pfaff and
Gauss, published this formula,⁴ although it was not much noted. In fact, in 1857,
Dedekind⁵ and Liouville⁶ published proofs of the inversion formula without reference
to Möbius.

Also during the period 1796–1800, Gauss studied the Galois theory of cyclotomic
extensions of the field $\mathbb{Z}_p$, by explicitly constructing the subfields of the splitting field
of the polynomial $x^{\nu} - 1$ over $\mathbb{Z}_p$, where $\nu$ was a positive integer not divisible by $p$.
He saw this theory as analogous to his cyclotomic theory over the field of rational 
numbers. Gauss applied his results to obtain a new proof of the law of quadratic 
reciprocity. He intended to include his work on the extensions of finite fields as the 
eighth section of his 1801 Disquisitiones, but omitted it to make room for his theory of 
binary quadratic forms, completed after 1798. In fact, binary quadratic forms occupied 
more than half of the published book, so that Gauss could include in the text only 
references to his unpublished work on finite fields. 

In 1830, Galois published his theory of algebraic extensions of finite fields⁷ by
using numbers analogous to complex numbers, known in the nineteenth century as
Galois imaginaries. These numbers were required in order to extend the field $\mathbb{Z}_p$.
For example, if a polynomial $F(x)$ of degree $\nu$ was irreducible over $\mathbb{Z}_p$, then Galois
assumed $i$ to be the imaginary solution of $F(x) = 0$; he then showed that the set
consisting of the $p^{\nu}$ expressions $a_0 + a_1 i + \cdots + a_{\nu-1} i^{\nu-1}$ could be given the structure
of a field and that $F(x)$ could be completely factored in this field. It is interesting to
note that Gauss preferred to avoid imaginary roots. For example, in the unpublished
eighth section of the Disquisitiones, he wrote,⁸


2 Eu. 1-3 pp. 240-281. E 449 § 37. 

3 Lagrange (1867-1892) vol. 2, pp. 667-669. Gauss refers to this work in Gauss (1965) p. 27. 
4 Möbius (1832).

5 Dedekind (1857).

6 Liouville (1857).

7 Galois (1830). 

8 Frei (2007) p. 180. See also Gauss (1981) p. 607.

It is clear that the congruence $\xi \equiv 0$ does not have real roots if $\xi$ has no factors of dimension
one; but nothing prevents us from decomposing $\xi$, nevertheless, into factors of two, three or more
dimensions, whereupon, in some sense, imaginary roots could be attributed to them. Indeed, we
could have shortened incomparably all our following investigations, had we wanted to introduce
such imaginary quantities by taking the same liberty some more recent mathematicians have
taken; but nevertheless, we have preferred to deduce everything from [first] principles. Perhaps,
we shall explain our view on this matter in more detail on another occasion.


In 1845, Theodor Schönemann, unaware of Galois’s work, published a paper on
algebraic extensions of $\mathbb{Z}_p$.⁹ By application of his theory, he partially recovered
Kummer’s result on the factorization of a prime $q \ne p$ in the cyclotomic field
generated by a $p$th root of unity. Schönemann also applied his theory to prove the
irreducibility of the cyclotomic polynomial $x^{p-1} + x^{p-2} + \cdots + x + 1$. In 1857,
Dedekind began to develop a theory of finite fields, in order to generalize Kummer’s 
theory of ideals in cyclotomic fields and to place it on a firm logical foundation. 
In 1871, Dedekind published his first version of this generalization as his theory of 
algebraic numbers. In this work, given a polynomial irreducible over the rational 
numbers, he delineated the relation between the factorization of this polynomial 
modulo p and the prime ideal factorization of the ideal generated by p in the number 
field arising out of the polynomial. We remark that Dedekind was familiar with the 
work of Schönemann and of Galois. Note that Galois’s 1830 paper was republished
by Liouville in the 1840s and that J. A. Serret’s 1854 algebra book¹⁰ discussed the
work of Galois in detail. 

Richard Dedekind (1831-1916) was the Ph.D. student of Gauss and mathematical 
friend of Dirichlet. Dedekind performed the valuable service of editing the works of 
Riemann and Gauss and the lectures of Dirichlet. In his 1857 paper, Dedekind carefully
showed that results in elementary number theory, including Fermat’s theorem
and Euler’s generalization that $a^{\varphi(m)} \equiv 1 \pmod{m}$, with $a$ and $m$ relatively prime,
could be carried over to the ring of polynomials over finite fields. Dedekind also
stated and proved the corresponding law of quadratic reciprocity. In 1902, Hermann
Kühne proved the general reciprocity theorem in $\mathbb{F}_q[x]$, with $\mathbb{F}_q$ a finite field with $q = p^m$
elements.¹¹ This theorem was rediscovered in 1925 by Friedrich K. Schmidt, and then
again in 1932 by Leonard Carlitz. The reciprocity laws are more easily proved for
polynomials in $\mathbb{F}_q[x]$ than for integers. In 1914, Heinrich Kornblum (1890–1914),
student of Landau, further developed this analogy by defining $L$-functions for $\mathbb{F}_q[x]$
and proving an analog of Dirichlet’s theorem on primes in arithmetic progressions.
Unfortunately, Kornblum was killed in World War I, but Landau published the work
in 1919. It may also be of interest to note that Gauss used the zeta function for $\mathbb{Z}_p[x]$ to
find a formula for the number of irreducible polynomials of degree $n$, although he did
not express it in such terms. Gauss’s formula implies that if $\pi(n)$ denotes the number of
irreducible polynomials of degree $n$, then

\[ \pi(n) - \frac{p^n}{n} = O\left( \frac{p^{n/2}}{n} \right); \]

set $x = p^n$, then $n = \log_p x$, to obtain

\[ \pi(n) = \frac{x}{\log_p x} + O\left( \frac{\sqrt{x}}{\log_p x} \right). \]

9 Schönemann (1845).
10 Serret (1854).
11 For more detail and references on this topic, see Frei (2007). See also Rosen (2002) chapters 2-4.
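The size of the error term can be seen in small cases. The Python sketch below is an illustration under stated assumptions: it uses the formula, in its modern Möbius-inversion form, for the number of monic irreducible polynomials of degree $n$ over $\mathbb{Z}_p$, namely $\frac{1}{n} \sum_{d \mid n} \mu(d) p^{n/d}$, and compares $n$ times that count with $p^n$. The function names are hypothetical, and the restriction to monic polynomials is an assumption about how the count is normalized.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0        # a squared prime factor forces mu = 0
            result = -result
        d += 1
    if n > 1:
        result = -result        # one remaining prime factor
    return result

def irreducible_count(p, n):
    """Number of monic irreducible polynomials of degree n over Z_p."""
    total = sum(mobius(d) * p ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

for n in range(1, 9):
    count = irreducible_count(2, n)
    # n * count differs from 2^n by a quantity of order 2^(n/2)
    print(n, count, 2 ** n - n * count, 2 ** (n / 2))
```

The third column stays within a small multiple of the fourth, which is the content of the $O(p^{n/2}/n)$ estimate after division by $n$.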


Note the similarity in appearance between this equation and the conjectured form 
of the number of primes less than x, following from the unproven Riemann hypothesis 
on the nontrivial zeros of the Riemann zeta function. Note that in 1973, Pierre Deligne 
established the Riemann hypothesis for the zeta function of smooth projective varieties 
over finite fields.¹² He based his proof on the novel framework for algebraic geometry
created by Alexander Grothendieck and his collaborators, including Deligne. Weil 
conjectured this theorem in 1949, so that it was also known as Weil’s conjecture. Even 
earlier, special cases of the Riemann hypothesis over finite fields had been proved; 
Emil Artin (1898-1962) presented the earliest example in 1921, when he defined 
and studied the zeta function of a quadratic extension of the field $\mathbb{Z}_p(x)$ and proved
the Riemann hypothesis for that case. Artin’s advisor, Gustav Herglotz (1881-1953), 
proposed the problem after reading Kornblum’s posthumously published thesis. 


38.2 Euler’s Proof of Fermat’s Little Theorem 


In 1736, in his second paper on number theory, Euler presented an inductive proof of
Fermat’s theorem.¹³ He stated the result that if $a^p - a$ is divisible by $p$, then
$(a+1)^p - (a+1)$ is also divisible by $p$. To see this, consider that by the binomial theorem,

\[ (a+1)^p = a^p + \frac{p}{1} a^{p-1} + \frac{p(p-1)}{1 \cdot 2} a^{p-2} + \cdots + \frac{p}{1} a + 1 \]

or

\[ (a+1)^p - a^p - 1 = \frac{p}{1} a^{p-1} + \frac{p(p-1)}{1 \cdot 2} a^{p-2} + \cdots + \frac{p}{1} a. \]

The right-hand side has $p$ as a factor in each term, and therefore $p$ divides the
left-hand side

\[ (a+1)^p - a^p - 1 = (a+1)^p - (a+1) - a^p + a. \]

This implies the required result, that if $p$ divides $a^p - a$, it must also divide
$(a+1)^p - (a+1)$. Since the result is true for $a = 1$, it is true for all positive integers
$a$. Moreover, if $p$ does not divide $a$, then since $p$ divides $a(a^{p-1} - 1)$, we obtain the
result that $p$ divides $a^{p-1} - 1$.
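The two facts on which the inductive proof rests can be checked directly for small primes. The following Python sketch is an illustration only, with hypothetical function names: it tests that $p$ divides each binomial coefficient $\binom{p}{k}$ with $0 < k < p$, that $p$ divides $(a+1)^p - a^p - 1$, and that $p$ divides $a^p - a$.

```python
from math import comb

def middle_binomials_divisible(p):
    """True if p divides C(p, k) for every k with 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

def induction_step_holds(a, p):
    """True if p divides (a+1)^p - a^p - 1, the quantity in the inductive step."""
    return ((a + 1) ** p - a ** p - 1) % p == 0

def fermat_holds(a, p):
    """True if p divides a^p - a."""
    return (pow(a, p, p) - a) % p == 0

for p in (2, 3, 5, 7, 11, 13):
    print(p, middle_binomials_divisible(p),
          all(induction_step_holds(a, p) and fermat_holds(a, p) for a in range(1, 21)))

# For a composite modulus the first step already fails: C(4, 2) = 6 is not divisible by 4.
print(middle_binomials_divisible(4))
```

The last line shows where primality enters: the factor $p$ in the numerator of $\binom{p}{k}$ cannot be cancelled by the denominator only because $p$ is prime.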


12 See Katz (1976) for a sketch of Deligne’s proof. 
13 Eu. 1-2 pp. 33-37. E54. 
For Euler’s multiplicative proof of Fermat’s theorem,¹⁴ we follow the concise 
presentation in Gauss’s Disquisitiones.¹⁵ Suppose a prime $p$ does not divide a positive 
integer $a$. Then there are at most $p - 1$ different remainders when $1, a, a^2, \ldots$ are 
divided by $p$. So let $a^m$ and $a^n$ have the same remainder with $m > n$. Then $a^{m-n} - 1$ 
is divisible by $p$. Let $t$ be the least positive integer such that $p$ divides $a^t - 1$. If $t = p - 1$, 
our proof is complete. If $t \neq p - 1$, then $1, a, a^2, \ldots, a^{t-1}$ have $t$ distinct remainders 
when divided by $p$. Thus, we can choose an integer $b$, not divisible by $p$ and not 
among $1, a, a^2, \ldots, a^{t-1}$ modulo $p$, and consider the numbers $b, ab, a^2 b, \ldots, a^{t-1} b$. 
Each of these numbers also leaves a different remainder after division by $p$ and each 
of these is different from the previous set of remainders. Hence, we have $2t \le p - 1$ 
remainders. If $2t = p - 1$, then our proof is complete. If not, then we can continue the 
process until some multiple of $t$ is $p - 1$. Since $t$ then divides $p - 1$ and $p$ divides 
$a^t - 1$, it follows that $p$ divides $a^{p-1} - 1$. This completes the proof of the theorem. 
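
The exhaustion argument can be made concrete. The sketch below (helper names are assumptions, not from the source) computes the least $t$ with $a^t \equiv 1$ and then splits the nonzero remainders into the sets $\{b, ab, \ldots, a^{t-1}b\}$ used in the proof.

```python
def order_mod(a: int, p: int) -> int:
    """Least t >= 1 with a^t ≡ 1 (mod p); assumes p does not divide a."""
    t, x = 1, a % p
    while x != 1:
        x = x * a % p
        t += 1
    return t


def remainder_blocks(a: int, p: int):
    """Return (t, blocks): the nonzero remainders split into sets {b, ab, ..., a^(t-1) b}."""
    t = order_mod(a, p)
    powers = [pow(a, j, p) for j in range(t)]
    remaining = set(range(1, p))
    blocks = []
    while remaining:
        b = min(remaining)                      # a remainder not yet accounted for
        block = {b * x % p for x in powers}
        blocks.append(sorted(block))
        remaining -= block
    return t, blocks


if __name__ == "__main__":
    print(remainder_blocks(2, 7))
```

Each block has exactly $t$ elements and the blocks are disjoint, so the number of blocks times $t$ equals $p - 1$.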


38.3 Gauss’s Proof That $\mathbb{Z}_p^{\times}$ Is Cyclic 


In one of his two proofs that $\mathbb{Z}_p^{\times}$ is cyclic, Gauss used a theorem of Lagrange: 
Assuming $A \not\equiv 0$ modulo $p$, the congruence 

\[ A x^m + B x^{m-1} + C x^{m-2} + \cdots + M x + N \equiv 0 \pmod{p} \]

has at most $m$ noncongruent solutions. Gauss presented a proof of this result,¹⁶ similar 
to that of Lagrange but more succinct. It is easy to see that a congruence of degree 1 
has at most one solution. Assume, with Gauss, that $m$ is the lowest degree for which 
the result is false; then $m \ge 2$. Suppose that the preceding congruence has at least 
$m + 1$ roots, $\alpha, \beta, \gamma, \ldots$, where $0 \le \alpha < \beta < \gamma < \cdots \le p - 1$. Now set $y = x - \alpha$ 
so that the congruence takes the form 

\[ A' y^m + B' y^{m-1} + C' y^{m-2} + \cdots + M' y + N' \equiv 0 \pmod{p}. \]

Note that this congruence has at least $m + 1$ solutions $0, \beta - \alpha, \gamma - \alpha, \ldots$. Since 
$y = 0$ is a solution, we must have $N' \equiv 0$. Thus, 

\[ y \left( A' y^{m-1} + B' y^{m-2} + C' y^{m-3} + \cdots + M' \right) \equiv 0 \pmod{p}. \]

If $y$ is replaced by any of the $m$ values $\beta - \alpha, \gamma - \alpha, \ldots$, then the congruence is satisfied 
but $y$ is not zero. This means that the $m$ values $\beta - \alpha, \gamma - \alpha, \ldots$ are solutions of the 
congruence of degree $m - 1$ 

\[ A' y^{m-1} + B' y^{m-2} + C' y^{m-3} + \cdots + M' \equiv 0, \qquad (A' = A \not\equiv 0). \]

This contradicts the statement that $m$ is the least integer for which the result is 
false, proving the theorem. Gauss thought that this theorem was significant; in his 


14 Eu. 1-2 pp. 493-518. E 262. 
15 Gauss (1965) art. 49-50 or Gauss (1863-1927) vol. I, pp. 40-42. 
16 Gauss (1965) art. 42 or Gauss (1863-1927) vol. I, pp. 34-35. 
Disquisitiones he discussed its history, pointing out that Euler had found special cases, 
Legendre had given it in his dissertation, and that in 1768 Lagrange had been the first 
to state and prove it. 
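
Lagrange’s bound depends on the modulus being prime. As a small illustration (a brute-force sketch with an assumed helper name, not Gauss’s argument), the following lists the roots of a polynomial congruence and shows that $x^2 - 1 \equiv 0$ has two roots modulo 7 but four modulo 8.

```python
def roots_mod(coeffs, n):
    """Residues x in 0..n-1 with coeffs[0]*x^m + ... + coeffs[m] ≡ 0 (mod n)."""
    def value(x):
        acc = 0
        for c in coeffs:                 # Horner's rule, reducing modulo n at each step
            acc = (acc * x + c) % n
        return acc
    return [x for x in range(n) if value(x) == 0]


if __name__ == "__main__":
    print(roots_mod([1, 0, -1], 7))      # prime modulus
    print(roots_mod([1, 0, -1], 8))      # composite modulus
```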

Gauss used this result to prove the proposition that there always exist primitive 
$(p-1)$th roots of unity modulo $p$.¹⁷ Suppose that $p - 1 = a^{\alpha} b^{\beta} c^{\gamma} \cdots$, where 
$a, b, c, \ldots$ are distinct primes. The first step is to show the existence of integers 
$A, B, C, \ldots$ of orders $a^{\alpha}, b^{\beta}, c^{\gamma}, \ldots$ respectively. Note that by Lagrange’s theorem 
given above, the congruence 

\[ x^{\frac{p-1}{a}} \equiv 1 \pmod{p} \]

has at most $\frac{p-1}{a}$ solutions. Hence there is an integer $g$, $1 \le g \le p - 1$, not a solution 
of the congruence. Now let $h$ be an integer, $1 \le h \le p - 1$, congruent to $g^{(p-1)/a^{\alpha}}$. It is 
clear that $h^{a^{\alpha}} \equiv 1 \pmod{p}$, but that no power $d$ less than $a^{\alpha}$ will give $h^d \equiv 1$. This is 
because $d$ must take the form $a^j$ with $j < \alpha$, so that $h^d \equiv g^{(p-1)/a^{\alpha - j}} \not\equiv 1$ by the definition 
of $g$. We may take $A$ to be $h$ and similarly find $B, C, \ldots$. We can now show that the 
order of $y = ABC \cdots$ is $p - 1$. To see this, suppose, without loss of generality, that 
the order of $y$ divides $\frac{p-1}{a}$. Since $b^{\beta}, c^{\gamma}, \ldots$ also divide $\frac{p-1}{a}$, it follows that 

\[ 1 \equiv y^{\frac{p-1}{a}} \equiv A^{\frac{p-1}{a}} B^{\frac{p-1}{a}} C^{\frac{p-1}{a}} \cdots \equiv A^{\frac{p-1}{a}} \pmod{p}. \]

This implies that $a^{\alpha}$ divides $\frac{p-1}{a}$, an impossibility. Thus, the theorem is proved. 
This argument of Gauss can be extended to arbitrary finite fields. By a different 
argument, Gauss showed that the number of primitive roots of unity modulo $p$ was 
equal to $\phi(p - 1)$. 
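
Gauss’s construction is effective and can be followed step by step. The sketch below is an illustration under stated assumptions: the function names are invented here, and $g$ is simply taken to be the smallest integer that is not a solution of $x^{(p-1)/a} \equiv 1$. It builds one element of each prime-power order and multiplies them together.

```python
def prime_power_factors(n: int):
    """Factor n into (prime, exponent) pairs by trial division."""
    out, d = [], 2
    while d * d <= n:
        if n % d == 0:
            e = 0
            while n % d == 0:
                n //= d
                e += 1
            out.append((d, e))
        d += 1
    if n > 1:
        out.append((n, 1))
    return out


def order_mod(x: int, p: int) -> int:
    """Least t >= 1 with x^t ≡ 1 (mod p)."""
    t, y = 1, x % p
    while y != 1:
        y = y * x % p
        t += 1
    return t


def gauss_primitive_root(p: int) -> int:
    """Product of elements A, B, C, ... of orders a^alpha, b^beta, c^gamma, ..."""
    y = 1
    for a, alpha in prime_power_factors(p - 1):
        # g is not a solution of x^((p-1)/a) ≡ 1 (mod p)
        g = next(g for g in range(1, p) if pow(g, (p - 1) // a, p) != 1)
        h = pow(g, (p - 1) // a**alpha, p)      # h has order exactly a^alpha
        y = y * h % p
    return y


if __name__ == "__main__":
    for p in (7, 11, 13, 31):
        r = gauss_primitive_root(p)
        print(p, r, order_mod(r, p))
```

For $p = 7$ the construction picks $A = 6$ of order 2 and $B = 4$ of order 3, and their product 3 has order 6.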


38.4 Gauss on Irreducible Polynomials Modulo a Prime 

To count the number of irreducible polynomials of a given degree modulo a prime $p$, 
Gauss started with the observation that the number of monic polynomials (mod $p$) 

\[ x^n + A x^{n-1} + B x^{n-2} + C x^{n-3} + \cdots \]

was $p^n$, because each of the $n$ coefficients $A, B, C, \ldots$ would take exactly $p$ values 
$0, 1, 2, \ldots, p - 1$. 

Thus, there were $p$ polynomials of degree 1, all irreducible. Gauss then remarked that 
it followed from the theory of combinations that the number of (monic) reducible 
degree-two polynomials was $\frac{p(p+1)}{2}$. So the number of irreducible ones would be 
given by 

\[ p^2 - \frac{p(p+1)}{2} = \frac{p^2 - p}{2}. \]


17 Gauss (1965) art. 55 or Gauss (1863-1927) vol. I, pp. 44-46. 
To determine the irreducible polynomials of higher degree, Gauss devised a 
notation and method.¹⁸ He let $(a)$ denote the number of irreducible polynomials 
of degree $a$ and $(a^{\alpha})$ the number of polynomials of degree $\alpha a$ factorizable into $\alpha$ 
irreducible polynomials of degree $a$. Gauss represented the number of polynomials of 
degree $\alpha + 2\beta + 3\gamma + \cdots$ with $\alpha$ factors of degree 1, $\beta$ factors of degree 2, $\gamma$ factors 
of degree 3, etc. by $(1^{\alpha} 2^{\beta} 3^{\gamma} 4^{\delta} \cdots)$. It followed that 

\[ (1^{\alpha} 2^{\beta} 3^{\gamma} 4^{\delta} \cdots) = (1^{\alpha})(2^{\beta})(3^{\gamma})(4^{\delta}) \cdots. \]

Again, Gauss remarked that the theory of combinations implied that 

\[ (a^{\alpha}) = \frac{(a)}{1} \cdot \frac{(a)+1}{2} \cdot \frac{(a)+2}{3} \cdots \frac{(a)+\alpha-1}{\alpha}. \]

Though Gauss did not bother, it is not difficult to prove this: Let $p_1(x), p_2(x), \ldots, 
p_{(a)}(x)$ be the irreducible polynomials of degree $a$. In a factorization of a polynomial 
of degree $\alpha a$, let $y_i$ factors be $p_i(x)$, $i = 1, 2, \ldots, (a)$. Then $(a^{\alpha})$ will be the number 
of nonnegative integer solutions of the equation 

\[ y_1 + y_2 + \cdots + y_{(a)} = \alpha, \]

the solution of which yields the same value given by Gauss. He then noted that 

\[ p = (1), \]
\[ p^2 = (1^2) + (2), \]
\[ p^3 = (1^3) + (1 \cdot 2) + (3), \]
\[ p^4 = (1^4) + (1^2 \cdot 2) + (1 \cdot 3) + (2^2) + (4), \]

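and so on. Using the formula for $(a^{\alpha})$, he found the following eight values: 

\[ (1) = p, \qquad (5) = \frac{p^5 - p}{5}, \]
\[ (2) = \frac{p^2 - p}{2}, \qquad (6) = \frac{p^6 - p^3 - p^2 + p}{6}, \]
\[ (3) = \frac{p^3 - p}{3}, \qquad (7) = \frac{p^7 - p}{7}, \]
\[ (4) = \frac{p^4 - p^2}{4}, \qquad (8) = \frac{p^8 - p^4}{8}. \]

Solving these equations led him to: 

\[ p = (1), \qquad p^5 = 5(5) + (1), \]
\[ p^2 = 2(2) + (1), \qquad p^6 = 6(6) + 3(3) + 2(2) + (1), \]
\[ p^3 = 3(3) + (1), \qquad p^7 = 7(7) + (1), \]
\[ p^4 = 4(4) + 2(2) + (1), \qquad p^8 = 8(8) + 4(4) + 2(2) + (1). \]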
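
These counts can be confirmed by brute force for small $p$. The sketch below is not Gauss’s combinatorial method: it marks every product of two monic polynomials of positive degree as reducible and counts what is left. Polynomials are coefficient tuples with the lowest degree first, and all names are illustrative assumptions.

```python
from itertools import product


def poly_mul(f, g, p):
    """Multiply two coefficient tuples (lowest degree first) modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return tuple(out)


def monic_polys(deg, p):
    """All p^deg monic polynomials of the given degree over Z_p."""
    return [tuple(c) + (1,) for c in product(range(p), repeat=deg)]


def count_irreducibles(max_deg, p):
    """Map d -> number (d) of monic irreducible polynomials of degree d <= max_deg."""
    reducible = set()
    for d1 in range(1, max_deg):
        for d2 in range(d1, max_deg - d1 + 1):
            for f in monic_polys(d1, p):
                for g in monic_polys(d2, p):
                    reducible.add(poly_mul(f, g, p))
    return {d: sum(1 for f in monic_polys(d, p) if f not in reducible)
            for d in range(1, max_deg + 1)}


if __name__ == "__main__":
    print(count_irreducibles(4, 3))
```

For $p = 3$ the counts agree with $(p^2 - p)/2$, $(p^3 - p)/3$ and $(p^4 - p^2)/4$, and they satisfy $p^4 = 4(4) + 2(2) + (1)$.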


18 Gauss (1981) pp. 609-611. 
These results then suggested to Gauss that 

\[ p^n = \alpha(\alpha) + \beta(\beta) + \gamma(\gamma) + \delta(\delta) + \cdots, \]

where $\alpha, \beta, \gamma, \delta, \ldots$ were all the divisors of $n$. He sketched a proof, using generating 
functions. Although there are a few missing lines in Gauss’s manuscript, it is not 
difficult to fill in the details. He wrote that the product 

\[ \left( \frac{1}{1-x} \right)^{(1)} \left( \frac{1}{1-x^2} \right)^{(2)} \left( \frac{1}{1-x^3} \right)^{(3)} \cdots \tag{38.1} \]

could be developed into the series 

\[ 1 + Ax + Bx^2 + Cx^3 + \cdots = P, \tag{38.2} \]

where $A = p$, $B = p^2$, $C = p^3, \ldots$. Then by taking the logarithmic derivative of 
(38.1), he got 

\[ \frac{x\,dP}{P\,dx} = \frac{(1)x}{1-x} + \frac{2(2)x^2}{1-x^2} + \frac{3(3)x^3}{1-x^3} + \cdots. \tag{38.3} \]

The required result followed after expanding the terms as an infinite series and 
equating coefficients. Gauss gave no details, but to prove (38.2), let $f$ denote a monic 
irreducible polynomial. Since $p^n$ is the number of monic polynomials of degree $n$, we 
see that by unique factorization of polynomials 

\[ \frac{1}{1 - px} = \sum_{n=0}^{\infty} p^n x^n \tag{38.4} \]
\[ = \sum_{n=0}^{\infty} (\text{number of monic polynomials of degree } n)\, x^n \]
\[ = \prod_f \left( 1 + x^{\deg f} + x^{2 \deg f} + \cdots \right) \]
\[ = \prod_f \left( 1 - x^{\deg f} \right)^{-1} \]
\[ = \prod_{d=1}^{\infty} \left( 1 - x^d \right)^{-(d)}, \]

where the notation $\prod_f$ stands for the product over all monic irreducible polynomials. This 
proves Gauss’s assertion that product (38.1) equals the series (38.2). Now by (38.3) 

\[ \frac{px}{1 - px} = \sum_{d=1}^{\infty} \frac{d(d)x^d}{1 - x^d} = \sum_{n=1}^{\infty} \Bigl( \sum_{d \mid n} d(d) \Bigr) x^n. \]



Gauss then equated the coefficients of $x^n$ on each side to get 

\[ p^n = \sum_{d \mid n} d(d). \tag{38.5} \]

He then inverted this formula to get $(n)$ in terms of $p^n$, stating that if $n = a^{\alpha} b^{\beta} 
c^{\gamma} \cdots$ where $a, b, c, \ldots$ were distinct primes, then 

\[ n(n) = p^n - \sum p^{\frac{n}{a}} + \sum p^{\frac{n}{ab}} - \sum p^{\frac{n}{abc}} + \cdots. \tag{38.6} \]

Gauss wrote that, for example, when $n = 36$, he had 

\[ 36(36) = p^{36} - p^{18} - p^{12} + p^{6}. \]

As a corollary of (38.6), Gauss observed that if $n = a^{\alpha}$, with $a$ prime, then 

\[ p^n \equiv p^{\frac{n}{a}} \pmod{n}. \]

And for $\alpha = 1$ and $a$ prime to $p$, the result was 

\[ p^{a-1} \equiv 1 \pmod{a}. \]

Note that it is also easy to see from (38.6) that $n(n) > 0$. Thus, there are irreducible 
polynomials of every degree $n$. Gauss gave no proof of the inversion (38.6) of (38.5). 
This means that Gauss knew the Möbius inversion formula before 1800 when he 
wrote up his researches on polynomials over the integers modulo $p$; Möbius’s paper 
appeared in 1832.¹⁹ 
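
In modern notation (38.6) reads $n(n) = \sum_{d \mid n} \mu(n/d)\, p^d$. The following sketch evaluates this with a small trial-division Möbius function and checks it against (38.5) and the $n = 36$ example; the function names are assumptions made for illustration.

```python
def mobius(n: int) -> int:
    """Möbius function by trial division: 0 if a square divides n, else (-1)^(number of primes)."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result


def num_irreducible(n: int, p: int) -> int:
    """The number (n) of monic irreducible polynomials of degree n over Z_p, from (38.6)."""
    total = sum(mobius(n // d) * p**d for d in range(1, n + 1) if n % d == 0)
    return total // n


if __name__ == "__main__":
    print([num_irreducible(n, 2) for n in range(1, 9)])
```

Only the divisors $d$ with $n/d$ squarefree contribute, which for $n = 36$ leaves exactly the four terms in Gauss’s example.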


38.5 Galois on Finite Fields 


Although the French mathematician Evariste Galois’s (1811-1832) research career 
lasted less than four years, his accomplishments have had lasting value and impor- 
tance. Galois’s premature death came about as a result of a tragic duel, the cause of 
which is not fully understood, but was possibly related to Galois’s political activities. 
In his number theory report of 1859-66, Smith wrote on Galois:²⁰ 


His mathematical works are collected in Liouville’s Journal, vol. xi. p. 381. Obscure and 
fragmentary as some of these papers are, they nevertheless evince an extraordinary genius, 
unparalleled, perhaps, for its early maturity, except by that of Pascal. It is impossible to read 
without emotion the letter in which, on the day before his death and in anticipation of it, Galois 
endeavours to rescue from oblivion the unfinished researches which have given him a place for 
ever in the history of mathematical science. 


Galois published his first paper in 1828 on purely periodic continued fractions,²¹ a 
topic studied by Euler and Lagrange in the 1760s. Euler had shown that a periodic 


19 Mobius (1832). 
20 Smith (1965b) p. 149. 
21 For an English translation, see Neumann (2011) pp. 36-47. 
continued fraction would satisfy a quadratic equation with integer coefficients. 
Lagrange proved the more difficult converse, that a quadratic irrational number, a 
number satisfying a quadratic equation with integer coefficients, could be expressed 
as a periodic continued fraction. Galois explicitly proved a theorem implicit in 
Lagrange: For integers $a_0, a_1, \ldots, a_n$, if the continued fraction 

\[ a_0 + \frac{1}{a_1 +}\,\frac{1}{a_2 +} \cdots \frac{1}{a_n +}\,\frac{1}{a_0 +}\,\frac{1}{a_1 +} \cdots \]

is a solution of a polynomial equation with integer coefficients, then the continued 
fraction 

\[ -\frac{1}{a_n +}\,\frac{1}{a_{n-1} +} \cdots \frac{1}{a_0 +}\,\frac{1}{a_n +} \cdots \]

is also a solution of the same polynomial equation. 

From a very early age, Galois had plans to develop the theory of algebraic 
extensions of fields. In this context, in 1830 Galois wrote his paper “Sur la théorie 
des nombres,”²² rediscovering Gauss’s unpublished results in this area and creating 
the theory of finite fields. Galois started with an equation $F(x) = 0$ modulo $p$, with 
$F(x)$ having integer coefficients and irreducible modulo $p$. Note that by this he meant 
that there could not exist polynomials $\phi(x)$, $\psi(x)$, and $\chi(x)$ with integer coefficients 
such that 

\[ \phi(x)\,\psi(x) = F(x) + p\,\chi(x). \]

After the initial portion of the paper, Galois omitted the modulo $p$, writing simply 
$F(x) = 0$. This means that he was assuming that the coefficients of the polynomials 
were taken from the finite field, integers modulo $p$. Galois argued that since $F(x)$ was 
irreducible, the equation $F(x) = 0$ had no solutions in integers (more precisely and 
in modern terms, no solutions in the finite field). He supposed $F$ to be of degree $\nu$ 
and denoted by $i$ an imaginary solution of $F(x) = 0$. Galois explained this imaginary 
solution by drawing an analogy with complex numbers. He let $\alpha$ denote any one of 
the $p^{\nu} - 1$ expressions 

\[ a + a_1 i + a_2 i^2 + \cdots + a_{\nu-1} i^{\nu-1}, \tag{38.7} \]

where $a, a_1, a_2, \ldots, a_{\nu-1}$ took values in the finite field, that is, $a_j = 0, 1, \ldots, p - 1$, 
though all the $a_j$ could not be zero. Then $\alpha, \alpha^2, \alpha^3, \ldots$ would all be expressions 
of the form (38.7), since if a power of $i$ of exponent $\nu$ or higher appeared, then $F(i) = 0$ 
could be used to express it in the form (38.7). Next, since there were only $p^{\nu} - 1$ 
different such expressions, Galois had $\alpha^k = \alpha^l$ for two different integers $k$ and $l$, 
or $\alpha^l(\alpha^{k-l} - 1) = 0$ (for $k > l$). From the irreducibility of $F$, Galois arrived at 
$\alpha^{k-l} = 1$. Letting $n$ be the least positive integer such that $\alpha^n = 1$, Galois noted 
that $1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$ were distinct; moreover, if $\beta$ were another expression of the 


22 Galois (1897) pp. 15-23. Neumann (2011) pp. 62-75 gives an English translation. 
form (38.7), then $\beta, \beta\alpha, \ldots, \beta\alpha^{n-1}$ would be $n$ additional elements distinct from each 
other and from the $\alpha^j$. Moreover, if $2n < p^{\nu} - 1$, then there would be yet another 
element in (38.7) distinct from the known $2n$ elements. Because this process could 
be continued, Galois concluded that $n$ divided $p^{\nu} - 1$, meaning that $\alpha^{p^{\nu}-1} = 1$ 
for every $\alpha$ of the form (38.7). This is Galois’s generalization of Fermat’s theorem. 
Here Galois also observed that by the known methods of number theory (in fact, by 
Gauss’s published argument outlined earlier), there was a primitive root $\alpha$ for which 
$n = p^{\nu} - 1$. Moreover, any primitive root had to satisfy a congruence of degree $\nu$ 
irreducible modulo $p$. 

Note that Galois’s generalization of Fermat’s theorem implied that all members of 
(38.7), including 0, were roots of the polynomial $x^{p^{\nu}} - x$. And every irreducible $F(x)$ 
would divide $x^{p^{\nu}} - x$ modulo $p$. We now continue to follow Galois. Because 

\[ (F(x))^{p^n} \equiv F(x^{p^n}) \pmod{p}, \]

the roots of $F(x) = 0$ had to be $i, i^p, i^{p^2}, \ldots, i^{p^{\nu-1}}$. Thus, he saw that all the roots of 
$x^{p^{\nu}} = x$ were polynomials in any root $\alpha$ of an irreducible polynomial of degree $\nu$. To 
find all the irreducible factors of $x^{p^{\nu}} - x$, he factored out all polynomials dividing 
$x^{p^{\mu}} - x$ for $\mu < \nu$. The remaining product of polynomials was then a product 
of irreducible polynomials of degree $\nu$. Galois pointed out that, since each of their 
roots was expressible in terms of a single root, these were obtainable through Gauss’s 
method. Recall that Galois did not write modulo $p$ repeatedly because he saw the 
coefficients of $F$ to be elements of the finite field $\mathbb{Z}_p$. 

Galois then gave an example in which $p = 7$ and $\nu = 3$. He here showed how to 
find the generator $\alpha$ of the multiplicative group of this field, as well as the irreducible 
polynomial equation satisfied by $\alpha$. He noted that $x^3 - 2$ was an irreducible polynomial 
of degree 3 modulo 7 and hence the roots of $x^{7^3} - x$ would be $a + a_1 i + a_2 i^2$ where 
$a, a_1, a_2$ took values $0, 1, \ldots, 6$ and $i^3 = 2$. Galois denoted $i$ by $\sqrt[3]{2}$ and then wrote the 
roots as 

\[ a + a_1 \sqrt[3]{2} + a_2 \sqrt[3]{4}. \]

To find a primitive root of $x^{7^3} - x$, Galois noted that $7^3 - 1 = 2 \cdot 3^2 \cdot 19$, so that he 
needed primitive roots of the three equations: 

\[ x^2 = 1, \qquad x^{3^2} = 1, \qquad x^{19} = 1. \]

He observed that $x = -1$ was a primitive root of the first equation. He cleverly 
noted that 

\[ x^{3^2} - 1 \equiv (x^3 - 1)(x^3 - 2)(x^3 - 4) \pmod{7}, \]

so that where $i^3 = 2$, $i$ was a primitive root of the second equation. For the third 
equation, Galois took $x = a + a_1 i$ so that 

\[ (a + a_1 i)^{19} = 1. \]
Expanding by the binomial theorem, referred to by Galois as Newton’s formula, 
and employing 

\[ a^{m(p-1)} = 1, \qquad a_1^{m(p-1)} = 1, \qquad i^3 = 2, \]

he reduced the expression modulo 7 to 

\[ 3\left[ a - a^4 a_1^3 + \left( a^5 a_1^2 + a^2 a_1^5 \right) i^2 \right] = 1, \tag{38.8} \]

so that 

\[ 3a - 3a^4 a_1^3 = 1, \qquad a^5 a_1^2 + a^2 a_1^5 = 0. \tag{38.9} \]

Galois saw that equations (38.8) and (38.9) were satisfied (modulo 7) by $a = -1$ 
and $a_1 = 1$. He therefore concluded that $-1 + i$ was a primitive root of $x^{19} = 1$. Thus, 
the product $i - i^2$ of the three primitive roots, $-1$, $i$, and $-1 + i$, was a primitive root 
of $x^{7^3 - 1} = 1$. By eliminating $i$ from 

\[ i^3 = 2 \quad \text{and} \quad \alpha = i - i^2, \]

he obtained the irreducible equation for the primitive root $\alpha$, 

\[ \alpha^3 - \alpha + 2 = 0. \]

Thus, $\alpha$ would generate all the nonzero elements of a finite field of $7^3$ members. 
Galois ended his paper with the observation that an arbitrary polynomial $F(x)$ of 
degree $n$ has $n$ real or imaginary roots. The real roots, assuming no multiple roots, 
could be found from the greatest common divisor of $F(x)$ and $x^{p-1} - 1$. Note that 
this can be obtained by means of the Euclidean algorithm. And the imaginary roots of 
degree 2 could be obtained from the greatest common divisor of $F(x)$ and $x^{p^2-1} - 1$; 
this process could clearly be continued. 
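
Galois’s example is small enough to verify directly. The sketch below models the field of $7^3$ elements as triples $(c_0, c_1, c_2)$ standing for $c_0 + c_1 i + c_2 i^2$ with $i^3 = 2$; this representation and the helper names are assumptions made for the illustration, not Galois’s notation.

```python
P = 7


def mul(u, v):
    """Multiply c0 + c1*i + c2*i^2 elements, reducing with i^3 = 2 (mod 7)."""
    raw = [0] * 5
    for a, x in enumerate(u):
        for b, y in enumerate(v):
            raw[a + b] += x * y
    # fold i^3 -> 2 and i^4 -> 2*i back into the lower coefficients
    return ((raw[0] + 2 * raw[3]) % P, (raw[1] + 2 * raw[4]) % P, raw[2] % P)


def power(u, n):
    """u^n by repeated squaring."""
    result = (1, 0, 0)
    while n:
        if n & 1:
            result = mul(result, u)
        u = mul(u, u)
        n >>= 1
    return result


def order(u):
    """Multiplicative order of a nonzero element; it divides 7^3 - 1 = 342."""
    return next(d for d in range(1, 343) if 342 % d == 0 and power(u, d) == (1, 0, 0))


MINUS_ONE = (P - 1, 0, 0)
I = (0, 1, 0)
MINUS_ONE_PLUS_I = (P - 1, 1, 0)
ALPHA = (0, 1, P - 1)            # alpha = i - i^2

if __name__ == "__main__":
    print(order(MINUS_ONE), order(I), order(MINUS_ONE_PLUS_I), order(ALPHA))
```

The three factors have the pairwise coprime orders 2, 9 and 19, which is why their product has order 342.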


38.6 Dedekind’s Formula 


With his characteristic systematic approach, in his paper of 1857, Dedekind explained 
how to develop the theory of polynomials over finite fields such that the analogy 
with the ring of integers was completely clear. We present just one formula from his 
paper, deriving an elegant expression for the product of all irreducible polynomials of 
degree $d$. The arguments given by Galois showed that if $F_d(x)$ was the product of all 
irreducible polynomials of degree $d$, then 

\[ x^{p^n} - x = \prod_{d \mid n} F_d(x). \]

By the multiplicative form of the Möbius inversion formula, a proof of which 
Dedekind provided, since it was not generally known in 1857, he obtained 

\[ F_n(x) = \frac{\left( x^{p^n} - x \right) \prod \left( x^{p^{\frac{n}{ab}}} - x \right) \cdots}{\prod \left( x^{p^{\frac{n}{a}}} - x \right) \prod \left( x^{p^{\frac{n}{abc}}} - x \right) \cdots}. \]

Here $a, b, c, \ldots$ denoted the distinct prime divisors of $n$. Thus, 

\[ \prod \left( x^{p^{\frac{n}{a}}} - x \right) = \left( x^{p^{\frac{n}{a}}} - x \right) \left( x^{p^{\frac{n}{b}}} - x \right) \left( x^{p^{\frac{n}{c}}} - x \right) \cdots. \]

With the use of the Möbius function $\mu$, Dedekind’s formula can now be written as 

\[ F_n(x) = \prod_{d \mid n} \left( x^{p^d} - x \right)^{\mu(n/d)}. \]

Note that Möbius stated his inversion formula for a sum; Dedekind extended it to 
cover a product. We observe that the symbol $\mu$ for the Möbius function was introduced 
by Mertens in 1874.²³ 
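
For $p = 2$ and $n = 4$ Dedekind’s formula reduces to $F_4(x) = (x^{16} - x)/(x^4 - x)$, since $\mu(4) = 0$. The sketch below checks this with polynomials over $\mathbb{Z}_2$ encoded as bitmasks (bit $k$ is the coefficient of $x^k$); the encoding, the helper names, and the hard-coded list of irreducible polynomials are assumptions of this illustration.

```python
def clmul(f: int, g: int) -> int:
    """Carry-less product of two bitmask polynomials over Z_2."""
    out = 0
    while g:
        if g & 1:
            out ^= f
        f <<= 1
        g >>= 1
    return out


def product_of(polys) -> int:
    out = 1
    for f in polys:
        out = clmul(out, f)
    return out


# Irreducible polynomials over Z_2 of degrees 1, 2 and 4.
IRREDUCIBLE = {
    1: [0b10, 0b11],                    # x, x + 1
    2: [0b111],                         # x^2 + x + 1
    4: [0b10011, 0b11001, 0b11111],     # x^4+x+1, x^4+x^3+1, x^4+x^3+x^2+x+1
}
F = {d: product_of(ps) for d, ps in IRREDUCIBLE.items()}


def x_pow_minus_x(k: int) -> int:
    """Bitmask of x^(2^k) - x over Z_2."""
    return (1 << (1 << k)) | 0b10


if __name__ == "__main__":
    print(bin(F[4]), bin(x_pow_minus_x(4)))
```

Multiplying $F_4$ by $x^4 - x$ recovers $x^{16} - x$, which is the $n = 4$ case of the formula written without division.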


38.7 Finite Field Analogs of the Gamma and Beta Integrals 


In Chapter 17, we noted that the gamma integral is given by 

\[ \int_0^{\infty} e^{-t}\, t^{a-1}\, dt \]

and the beta integral by 

\[ \int_0^1 t^{a-1} (1-t)^{b-1}\, dt. \]

Next observe that $\psi(t) = e^{ct}$ is the general nonzero and real-valued solution of the 
functional equation 

\[ \psi(x + y) = \psi(x)\,\psi(y). \tag{38.10} \]

To verify this, suppose $\psi$ is nonzero, continuous, and satisfies (38.10). Clearly, 
$\psi(0) = 1$ and $\psi(x) > 0$. Let $\psi(1) = e^c$, with $c$ some constant. The argument used 
by Euler in Section 4.3 can be applied here to show that for integers $m$ and $n$, 

\[ \psi\!\left( \frac{m}{n} \right) = e^{c\frac{m}{n}}. \tag{38.11} \]

Since $\psi$ is continuous, (38.11) implies that 

\[ \psi(t) = e^{ct}. \]


23 Mertens (1874b). 
Similarly, $\chi(t) = t^c$ is the general nonzero solution of 

\[ \chi(xy) = \chi(x)\,\chi(y); \tag{38.12} \]

to see this, observe that $\psi(t) = \chi(e^t)$ satisfies (38.10). 

We mention that Kac proved that to show that $\psi(t) = e^{ct}$, it is sufficient to assume 
that $\psi$ satisfying (38.10) is integrable.²⁴ To prove this, choose a real number $a$ such 
that $\int_0^a \psi(t)\,dt \neq 0$. Then 

\[ \psi(x) \int_0^a \psi(t)\,dt = \int_0^a \psi(x + t)\,dt = \int_x^{a+x} \psi(t)\,dt \]

and thus 

\[ \psi(x) = \frac{\int_x^{a+x} \psi(t)\,dt}{\int_0^a \psi(t)\,dt}. \tag{38.13} \]

Equation (38.13) implies that if $\psi$ is integrable and satisfies (38.10), then it must 
be continuous; hence, $\psi(t) = e^{ct}$. 

Now observe that the gamma function is the integral of the product of a function 
satisfying (38.12) and a function satisfying (38.10). Moreover, the beta integral is the 
integral of the product of two functions, $t^{a-1}$ and $(1-t)^{b-1}$, both of which satisfy (38.12); also, the 
sum of $t$ and $1 - t$ is one. 

We may now look for the finite field analogs of the gamma and beta integrals. 
Gauss, Jacobi, and Eisenstein first determined these analogs, working, as we shall, in 
the finite field $\mathbb{Z}_p$, the integers modulo a prime $p$. A function defined in $\mathbb{Z}_p$ has to be 
periodic, with period $p$; it is natural, then, to consider 

\[ \psi(x) = e^{\frac{2\pi i k x}{p}}, \qquad k = 0, 1, 2, \ldots, p - 1 \]

as the $p$ basic functions satisfying (38.10). Thus, $\psi$ is a map satisfying (38.10): 

\[ \psi : \mathbb{Z}_p \to \mathbb{C}. \]

Since it satisfies $\psi(x + y) = \psi(x)\psi(y)$, we call $\psi$ an additive character. 
Similarly, a multiplicative character $\chi$ is a map 

\[ \chi : \mathbb{Z}_p \to \mathbb{C} \]

such that $\chi(mn) = \chi(m)\chi(n)$ and $\chi(0) = 0$. We have seen how multiplicative charac- 
ters can be constructed: Let $g$ generate the cyclic group $\mathbb{Z}_p \setminus \{0\}$ and let $\nu(n)$ denote 
the integer (mod $(p - 1)$) such that 

\[ g^{\nu(n)} \equiv n \pmod{p}. \tag{38.14} \]


24 Kac (1936-1937). 




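Then 

\[ g^{\nu(mn)} \equiv mn \equiv g^{\nu(m)} g^{\nu(n)} \equiv g^{\nu(m) + \nu(n)} \pmod{p} \]

and 

\[ \nu(mn) \equiv \nu(m) + \nu(n) \pmod{p - 1}. \]

If $\omega = e^{\frac{2\pi i k}{p-1}}$, $k = 1, 2, \ldots, p - 1$, then multiplicative characters will be defined by 
$\chi(n) = \omega^{\nu(n)}$, $n \in \mathbb{Z}_p \setminus \{0\}$, and $\chi(0) = 0$. 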

Now because integrals become sums when defined on finite sets, the sums 
corresponding to the gamma and beta integrals would be given respectively by 

\[ \sum_{n=0}^{p-1} \chi(n)\,\psi(n), \tag{38.15} \]

\[ \sum_{n=0}^{p-1} \chi_1(n)\,\chi_2(1 - n), \tag{38.16} \]

where $\chi_1$ and $\chi_2$ are two multiplicative characters. The two sums (38.15) and (38.16) 
are designated, respectively, Gauss and Jacobi sums; Gauss sums have also been called 
Lagrange resolvents. The reasons for these names can be understood by delving into 
the history of this topic. 
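
The definitions above translate directly into a numerical sketch. The code below builds the index table of (38.14), a multiplicative character $\chi(n) = \omega^{\nu(n)}$, and the sums (38.15) and (38.16) in floating-point complex arithmetic. The function names and the choice of the smallest generator $g$ are assumptions of this illustration.

```python
import cmath


def primitive_root(p: int) -> int:
    """Smallest generator g of the nonzero residues modulo the prime p."""
    for g in range(2, p):
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:
            return g
    return 1  # only reached for p = 2


def index_table(p: int) -> dict:
    """nu(n) with g^nu(n) ≡ n (mod p), as in (38.14)."""
    g, table, x = primitive_root(p), {}, 1
    for e in range(p - 1):
        table[x] = e
        x = x * g % p
    return table


def mult_char(p: int, k: int):
    """chi(n) = omega^nu(n) with omega = exp(2*pi*i*k/(p-1)), and chi(0) = 0."""
    nu = index_table(p)
    omega = cmath.exp(2j * cmath.pi * k / (p - 1))
    return lambda n: 0 if n % p == 0 else omega ** nu[n % p]


def gauss_sum(p: int, chi, k: int = 1) -> complex:
    """The sum (38.15) with psi(n) = exp(2*pi*i*k*n/p)."""
    return sum(chi(n) * cmath.exp(2j * cmath.pi * k * n / p) for n in range(p))


def jacobi_sum(p: int, chi1, chi2) -> complex:
    """The sum (38.16)."""
    return sum(chi1(n) * chi2(1 - n) for n in range(p))


if __name__ == "__main__":
    chi = mult_char(13, 1)
    print(abs(gauss_sum(13, chi)) ** 2)
```

For a nontrivial character the squared absolute value of the Gauss sum comes out numerically equal to $p$, in line with the identity for $g(\chi)\,g(\bar\chi)$ discussed later in this section.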

In 1770-1771, Lagrange published, in two parts, his paper, “Réflexions sur la 
résolution algébrique des équations,”²⁵ in the Berlin Academy Journal. In this paper, 
Lagrange discussed the various methods that had previously been used to solve, by 
radicals, equations of degree 2, 3 and 4. He also analyzed these equations, and general 
equations of the $n$th degree, by means of permutations of the roots. In article 86 of his 
paper, he wrote that the simplest expression that solved equations of degrees 2, 3 and 
4 could be written as 

\[ x_1 + \omega x_2 + \omega^2 x_3 + \cdots + \omega^{n-1} x_n, \tag{38.17} \]

where $x_1, x_2, \ldots, x_n$ represented the $n$ roots of the equation of degree $n$ and $\omega \neq 1$ 
was a root of $\omega^n - 1 = 0$. The expression (38.17) is called the Lagrange resolvent. 
The idea of this resolvent was also discovered by Alexandre-Théophile Vandermonde 
(1735-1796) at about the same time. Both Lagrange and Vandermonde applied their 
ideas to solve the equation 

\[ \frac{x^p - 1}{x - 1} = 0, \]

where $p$ was a prime. Of course, for $p = 2, 3, 5, 7$, the equation is not difficult to solve; 
when $p = 3, 5, 7$, the substitution $y = x + \frac{1}{x}$ reduces the equations to those of degree 
1, 2, 3 respectively. The same substitution reduces the equation for $p = 11$ to one 


25 Lagrange (1770-1771). 
of degree 5, an equation that apparently stumped Lagrange. However, Vandermonde 
constructed a solution for this fifth-degree equation.²⁶ His idea was to choose the ten 
roots of $\frac{x^{11} - 1}{x - 1} = 0$ in (38.17) in a certain order; he took the $k$th term $x_k$, $1 \le k \le 10$, 
to be $\beta^{\nu(k)}$, where $\beta$ was a primitive tenth root of unity and $\nu(k)$ was the least positive 
integer such that 

\[ 2^{\nu(k)} \equiv k \pmod{11}. \]

Note that 2 is a primitive root modulo 11. Thus, Vandermonde had used a Gauss 
sum to solve $x^{11} - 1 = 0$. 

From entry 37 in his mathematical journal, it appears that Gauss may have 
rediscovered the idea of a Lagrange resolvent; it is very possible that Gauss was 
unaware of Vandermonde’s paper on the solution of algebraic equations when he 
began work on the cyclotomic equation $x^n - 1 = 0$. Gauss’s motivation in studying 
this equation was number theoretic. In the preface to his Disquisitiones Arithmeticae, 
he wrote,²⁷ 


The theory of the division of the circle or of a regular polygon treated in Section VII of itself does 
not pertain to Arithmetic but the principles involved depend uniquely on Higher Arithmetic. This 
will perhaps prove unexpected to geometers, but I hope they will be equally pleased with the new 
results that derive from this treatment. 


One such result was that the roots of the equation $x^n - 1 = 0$ could be expressed in 
terms of radicals. In article 359 of his Disquisitiones, he wrote, 


Everyone knows that the most eminent geometers have been ineffectual in the search for a general 
solution of equations higher than the fourth degree, or (to define the search more accurately) for 
the REDUCTION OF MIXED EQUATIONS TO PURE EQUATIONS. And there is little doubt 
that this problem does not so much defy modern methods of analysis as that it proposes the 
impossible (cf. what we said on this subject in Demonstratio nova, art. 9). Nevertheless it is 
certain that there are innumerable mixed equations of every degree which admit a reduction to 
pure equations, and we trust the geometers will find it gratifying if we show that our equations 
are always of this kind. But because of the length of this discussion we will present here only the 
most important principles necessary to show the possibility of our claim and reserve for another 
time a more complete consideration worthy of this argument. 


As a particular case, Gauss proved his famous result on constructible regular polygons; 
the solutions of the corresponding equations have only square roots as radicals. Gauss 
used results on products of Gauss sums in his analysis in the Disquisitiones; though 
he did not present complete proofs there, he wrote that he indeed had them. In fact, he 
gave such proofs in an unpublished paper,²⁸ “Disquisitionum circa aequationes puras 
ulterior evolutio.” 

In a February 1827 letter to Gauss,²⁹ Jacobi discussed at length his discovery of 
some properties of Gauss sums. Apparently, Gauss did not give a reply to this letter, 


26 Vandermonde (1770-1771), especially pp. 415-416. 
27 Gauss (1965) p. xx. 

28 Gauss (1863-1927) vol. 2, pp. 243-265. 

29 Jacobi (1969) vol. 7, pp. 393-402. 
perhaps because the proofs of most of the results mentioned by Jacobi were contained 
in Gauss’s paper on pure equations. Jacobi then published a paper on this topic,³⁰ in 
which he observed the analogy between the gamma or beta integrals and the Gauss or 
Jacobi sums; in fact, Jacobi also stated an analog of the multidimensional integral of 
Dirichlet. 

Turning to a discussion of the properties of Gauss and Jacobi sums, we follow 
Eisenstein’s 1844 exposition³¹ because of its very systematic approach. Eisenstein 
was apparently unaware of Jacobi’s 1837 paper when he published this work. Note 
that in 1846, Jacobi republished his paper in Crelle’s Journal.³² 

Eisenstein started with the sum, with $\alpha$ and $\beta$ integers, 

\[ \phi(\alpha, \beta) = \sum_{k=1}^{p-1} e^{\frac{2\pi i \alpha\, \nu(k)}{p-1}}\, e^{\frac{2\pi i \beta k}{p}}, \tag{38.18} \]

where $\nu(k)$ was defined by (38.14). Following Gauss, Eisenstein denoted $\nu(k)$ by 
“Ind. k,” a notation introduced by Gauss in article 59 of his Disquisitiones. Note that 
the first term in the sum in (38.18) denotes a multiplicative character; we denote it by 
$\chi(k)$ so that we may write (38.18) as 

\[ \phi(\alpha, \beta) = g_{\beta}(\chi). \tag{38.19} \]

Eisenstein noted some simple properties of these Gauss sums: 

(a) $\phi(\alpha, \beta) = \phi(\alpha', \beta')$, when $\alpha \equiv \alpha' \pmod{p - 1}$ and $\beta \equiv \beta' \pmod{p}$; 
(b) $\phi(\alpha, 0) = 0$, when $\alpha \not\equiv 0 \pmod{p - 1}$; 
(c) $\phi(0, \beta) = -1$, when $\beta \not\equiv 0 \pmod{p}$; 
(d) $\phi(0, 0) = p - 1$. 

We indicate a proof of (b), where the condition $\alpha \not\equiv 0 \pmod{p - 1}$ means that there 
exists a $k \in \mathbb{Z}_p \setminus \{0\}$ such that $\chi(k) \neq 1$. Then 

\[ \phi(\alpha, 0) = g_0(\chi) = \sum_{s=1}^{p-1} \chi(s) = \sum_{s=1}^{p-1} \chi(sk) = \chi(k) \sum_{s=1}^{p-1} \chi(s). \]

Since $\chi(k) \neq 1$, we have $\sum_{s=1}^{p-1} \chi(s) = 0$. Note that $sk$ and $s$ run over all integers 
$1, 2, \ldots, p - 1$ modulo $p$. 

We call $\chi$ a trivial character if $\chi(k) = 1$ for all $k \in \mathbb{Z}_p \setminus \{0\}$; this corresponds to 
$\alpha = 0$. Eisenstein proved for nontrivial characters $\chi$ that when $\beta \not\equiv 0 \pmod{p}$, 

\[ g_{\beta}(\chi) = \chi(\beta^{-1})\, g_1(\chi); \tag{38.20} \]


30 Jacobi (1837). 
31 Eisenstein (1844). 
32 Jacobi (1846). 
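note that, since $\beta \not\equiv 0 \pmod{p}$, $\beta^{-1}$ exists. Eisenstein’s proof was that 

\[ g_{\beta}(\chi) = \sum_{s=1}^{p-1} \chi(s)\, e^{\frac{2\pi i \beta s}{p}} = \sum_{s=1}^{p-1} \chi(\beta^{-1} s)\, e^{\frac{2\pi i s}{p}} \]
\[ = \chi(\beta^{-1}) \sum_{s=1}^{p-1} \chi(s)\, e^{\frac{2\pi i s}{p}} = \chi(\beta^{-1})\, g_1(\chi). \]

We observe that $s$ could be replaced by $\beta^{-1} s$ because $\beta^{-1} s$ runs through all integers 
$1, 2, \ldots, p - 1$ modulo $p$, as does $s$. Also note that we can write $\chi(\beta^{-1}) = \bar{\chi}(\beta)$ because 

\[ \chi(\beta^{-1})\,\chi(\beta) = \chi(1) = 1 \quad \text{and} \quad \bar{\chi}(\beta)\,\chi(\beta) = 1. \]

So by using $g(\chi)$ for $g_1(\chi)$, we can rewrite (38.20) as 

\[ g_{\beta}(\chi) = \bar{\chi}(\beta)\, g(\chi), \qquad \beta \not\equiv 0 \pmod{p}. \tag{38.21} \]

Eisenstein next proved that for $\alpha \not\equiv 0 \pmod{p - 1}$, 

\[ \phi(\alpha, 1)\, \phi(-\alpha, 1) = (-1)^{\alpha} p \]

or for nontrivial $\chi$, 

\[ g(\chi)\, g(\bar{\chi}) = \chi(-1)\, p. \tag{38.22} \]

To prove (38.22), Eisenstein observed that 

\[ g(\chi)\, g(\bar{\chi}) = \sum_{m=1}^{p-1} \bar{\chi}(m) \sum_{n=1}^{p-1} \chi(n)\, e^{\frac{2\pi i (m+n)}{p}}. \]

Given an $m$, he noted, for each $n$ there must exist a $\sigma$ such that $n \equiv \sigma m$. Thus 

\[ g(\chi)\, g(\bar{\chi}) = \sum_{m=1}^{p-1} \bar{\chi}(m) \sum_{\sigma=1}^{p-1} \chi(\sigma m)\, e^{\frac{2\pi i (\sigma+1) m}{p}} \]
\[ = \sum_{\sigma=1}^{p-1} \chi(\sigma) \sum_{m=1}^{p-1} \chi(m)\, \bar{\chi}(m)\, e^{\frac{2\pi i (\sigma+1) m}{p}} \]
\[ = \sum_{\sigma=1}^{p-1} \chi(\sigma) \sum_{m=1}^{p-1} e^{\frac{2\pi i (\sigma+1) m}{p}}. \tag{38.23} \]

He went on to point out that when $\sigma = p - 1$, the inner sum in (38.23) was equal 
to $p - 1$ and for other values of $\sigma$, it was $-1$. Thus 

\[ g(\chi)\, g(\bar{\chi}) = -\sum_{\sigma=1}^{p-2} \chi(\sigma) + (p - 1)\,\chi(p - 1) \]
\[ = -\sum_{\sigma=1}^{p-1} \chi(\sigma) + p\,\chi(p - 1) \]
\[ = 0 + p\,\chi(-1). \]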
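
Eisenstein’s properties (a) through (d) and the product formula for $\phi(\alpha, 1)\,\phi(-\alpha, 1)$ can be checked numerically. The following is a floating-point sketch with assumed function names; it evaluates the sum (38.18) directly, taking indices with respect to the smallest primitive root.

```python
import cmath


def indices(p: int) -> dict:
    """Table nu with g^nu[n] ≡ n (mod p) for the smallest primitive root g of the prime p."""
    for g in range(1, p):
        table, x = {}, 1
        for e in range(p - 1):
            table.setdefault(x, e)      # keep the least exponent for each residue
            x = x * g % p
        if len(table) == p - 1:
            return table
    raise ValueError("p must be prime")


def phi(alpha: int, beta: int, p: int) -> complex:
    """Eisenstein's sum (38.18)."""
    nu = indices(p)
    return sum(
        cmath.exp(2j * cmath.pi * alpha * nu[k] / (p - 1))
        * cmath.exp(2j * cmath.pi * beta * k / p)
        for k in range(1, p)
    )


if __name__ == "__main__":
    p = 11
    print(phi(3, 0, p), phi(0, 4, p), phi(0, 0, p))
    print(phi(3, 1, p) * phi(-3, 1, p))
```

With $p = 11$ and $\alpha = 3$ the last product is numerically $-11$, matching $(-1)^{\alpha} p$.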


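Eisenstein then established the important relationship between Gauss and Jacobi 
sums, proving that for nontrivial multiplicative characters $\chi$ and $\eta$, such that $\chi\eta$ would 
also be nontrivial, 

\[ \frac{g(\chi)\, g(\eta)}{g(\chi\eta)} = \sum_{\sigma=1}^{p-2} \chi(\sigma)\, \overline{\chi\eta}(\sigma + 1). \tag{38.24} \]

Note that we can rewrite the right-hand side of (38.24) as 

\[ \sum_{\sigma=1}^{p-2} \chi\!\left( \sigma(\sigma+1)^{-1} \right) \eta\!\left( (\sigma+1)^{-1} \right) \]
\[ = \sum_{\sigma=1}^{p-2} \chi\!\left( \sigma(\sigma+1)^{-1} \right) \eta\!\left( 1 - \sigma(\sigma+1)^{-1} \right) \]
\[ = \sum_{n=2}^{p-1} \chi(n)\, \eta(1 - n) \]
\[ = \sum_{n=1}^{p-1} \chi(n)\, \eta(1 - n). \tag{38.25} \]

We have given (38.25) as the definition of the Jacobi sum in (38.16); denote it by 

\[ J(\chi, \eta) = \sum_{n=1}^{p-1} \chi(n)\, \eta(1 - n). \tag{38.26} \]

Thus 

\[ J(\chi, \eta) = \frac{g(\chi)\, g(\eta)}{g(\chi\eta)}. \tag{38.27} \]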


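To prove (38.24), Eisenstein observed that 

\[ g(\eta)\, g(\chi) = \sum_{m,n=1}^{p-1} \eta(m)\, \chi(n)\, e^{\frac{2\pi i (m+n)}{p}} 
= \sum_{m,\sigma=1}^{p-1} \eta(m)\, \chi(m\sigma)\, e^{\frac{2\pi i (\sigma+1) m}{p}} \tag{38.28} \]

and that, in (38.28) because $\chi\eta$ was nontrivial, the part of the sum corresponding to 
$\sigma = p - 1$ was 

\[ \sum_{m=1}^{p-1} (\chi\eta)(m)\, \chi(-1) = 0. \tag{38.29} \]

He could apply (38.21) to find that the sum in (38.28) for values of $\sigma$ (other than 
$p - 1$) produced 

\[ g(\chi)\, g(\eta) = \sum_{\sigma=1}^{p-2} \chi(\sigma)\, \overline{\chi\eta}(\sigma + 1) \sum_{m=1}^{p-1} (\chi\eta)(m)\, e^{\frac{2\pi i m}{p}}, \]

thus verifying (38.24). Next, from (38.22) and (38.27), Eisenstein perceived that when 
$\chi$, $\eta$, and $\chi\eta$ were nontrivial, 

\[ J(\chi, \eta)\, J(\bar{\chi}, \bar{\eta}) = \frac{g(\chi)\, g(\eta)\, g(\bar{\chi})\, g(\bar{\eta})}{g(\chi\eta)\, g(\overline{\chi\eta})} 
= \frac{\chi(-1)\,p \cdot \eta(-1)\,p}{(\chi\eta)(-1)\,p} = p. \tag{38.30} \]

Eisenstein gave two corollaries of (38.30): for the cases $p = 4f + 1$ and $p = 3f + 1$. 
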

For $p = 4f + 1$, he noted that the character $\chi(n) = e^{\frac{2\pi i \nu(n)}{4f}}$ was of order $4f$, meaning 
that $4f$ was the smallest positive integer such that $\chi^{4f}(n) = 1$ for all $n \in \mathbb{Z}_p \setminus \{0\}$. Therefore, 

\[ \eta(n) = \chi^f(n) = e^{\frac{2\pi i \nu(n)}{4}} = i^{\nu(n)}. \]

Thus he had, with $A$ and $B$ integers, 

\[ J(\eta, \eta) = \sum_{n=1}^{4f} \eta(n)\, \eta(1 - n) = \sum_{n=1}^{4f} \eta(n - n^2) = \sum_{n=2}^{4f} i^{\nu(n - n^2)} = A + Bi \]

(the term $n = 1$ vanishes because $\eta(0) = 0$) and hence 

\[ p = A^2 + B^2. \tag{38.31} \]

Similarly, for $p = 3f + 1$, Eisenstein argued that there was a character $\eta$ of 
order 3. So 

\[ \eta(n) = e^{\frac{2\pi i \nu(n)}{3}} \]

and 

\[ J(\eta, \eta) = \sum_{n=2}^{3f} e^{\frac{2\pi i \nu(n - n^2)}{3}} = A + B e^{\frac{2\pi i}{3}}, \]

and he could conclude that 

\[ p = A^2 - AB + B^2. \tag{38.32} \]

Note that (38.32) implies that 

\[ 4p = (2A - B)^2 + 3B^2 = (A - 2B)^2 + 3A^2 = (A + B)^2 + 3(A - B)^2. \]

Now since one of $B$, $A$, or $A - B$ must be divisible by 3, we have 

\[ 4p = X^2 + 27Y^2, \tag{38.33} \]

with $X$ and $Y$ integers. 
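
Both corollaries can be computed in exact integer arithmetic by counting how often each power of $i$ (or of $e^{2\pi i/3}$) occurs in the Jacobi sum. The sketch below uses assumed function names and takes indices with respect to the smallest primitive root; a different primitive root may change the signs or the order of $A$ and $B$ but not the resulting representation of $p$.

```python
def index_table(p: int) -> dict:
    """nu[n] with g^nu[n] ≡ n (mod p) for the smallest primitive root g of the prime p."""
    for g in range(1, p):
        table, x = {}, 1
        for e in range(p - 1):
            table.setdefault(x, e)
            x = x * g % p
        if len(table) == p - 1:
            return table
    raise ValueError("p must be prime")


def jacobi_quartic(p: int):
    """(A, B) with J(eta, eta) = A + B*i for eta(n) = i^nu(n); assumes p ≡ 1 (mod 4)."""
    nu = index_table(p)
    counts = [0, 0, 0, 0]
    for n in range(2, p):                        # n = 1 contributes eta(0) = 0
        counts[nu[n * (1 - n) % p] % 4] += 1
    # i^0 = 1, i^1 = i, i^2 = -1, i^3 = -i
    return counts[0] - counts[2], counts[1] - counts[3]


def jacobi_cubic(p: int):
    """(A, B) with J(eta, eta) = A + B*w, w = exp(2*pi*i/3); assumes p ≡ 1 (mod 3)."""
    nu = index_table(p)
    counts = [0, 0, 0]
    for n in range(2, p):
        counts[nu[n * (1 - n) % p] % 3] += 1
    # w^2 = -1 - w, so c0 + c1*w + c2*w^2 = (c0 - c2) + (c1 - c2)*w
    return counts[0] - counts[2], counts[1] - counts[2]


if __name__ == "__main__":
    print(jacobi_quartic(5), jacobi_cubic(7))
```

For $p = 5$ this gives $(A, B) = (-1, -2)$ with $A^2 + B^2 = 5$, and for $p = 7$ it gives $(A, B) = (-1, -3)$ with $A^2 - AB + B^2 = 7$ and $4 \cdot 7 = 1^2 + 27 \cdot 1^2$.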

In his 1827 letter to Gauss, Jacobi mentioned (38.22), saying that it would prove 
the assertion left unproved by Gauss, due to a lack of space, in article 360 of his 
Disquisitiones. Jacobi also noted formulas (38.31) and (38.33) in his letter; Gauss had 
given a lengthy proof of (38.32) in his Disquisitiones. 


38.8 Weil: Solutions of Equations in Finite Fields 


The Gauss and Jacobi sums defined in Section 38.7 were taken on the field $\mathbb{Z}_p$, where 
$p$ was a prime. In a paper of 1890,³³ L. Stickelberger, a student of Weierstrass and 
Kummer, defined these sums on general finite fields with $q = p^m$ elements. Now to 
define Gauss sums on an arbitrary finite field $F$ with, say, $p^m$ elements, it is necessary to 
define an additive character on $F$ and a multiplicative character on $F^{\times} = F \setminus \{0\}$. We 
define the additive character as a function $\psi$ in a manner consistent with our definition 
of an extension of $F$: Let $\alpha \in F$ and let the trace Tr of $\alpha$ be 

\[ \mathrm{Tr}(\alpha) = \alpha + \alpha^p + \alpha^{p^2} + \cdots + \alpha^{p^{m-1}}. \]

Note that 

\[ (\mathrm{Tr}(\alpha))^p = \alpha^p + \alpha^{p^2} + \cdots + \alpha^{p^m} = \mathrm{Tr}(\alpha), \]

because $\alpha^{p^m} = \alpha$. Now, based on Galois’s theory, developed in Section 38.5, of which 
Gauss was also aware,³⁴ since $\mathrm{Tr}(\alpha)$ is a solution of $x^p = x$, it follows that $\mathrm{Tr}(\alpha) \in \mathbb{Z}_p$. 
It is then not difficult to show that 

\[ \mathrm{Tr} : F \to \mathbb{Z}_p \]

is an onto map and that 

\[ \mathrm{Tr}(\alpha + \beta) = \mathrm{Tr}(\alpha) + \mathrm{Tr}(\beta). \tag{38.34} \]
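
The properties of the trace can be seen in the smallest nontrivial case. The sketch below assumes the field of 9 elements modeled as pairs $(a, b)$ standing for $a + bi$ with $i^2 = -1$ over $\mathbb{Z}_3$ (the polynomial $x^2 + 1$ is irreducible modulo 3); the representation and names are illustrative choices, not taken from the source.

```python
P = 3   # F = Z_3[i]/(i^2 + 1) has p^m = 9 elements, with m = 2


def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    """(a + b*i)(c + d*i) with i^2 = -1, coefficients reduced modulo 3."""
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)


def frobenius(u):
    """The map u -> u^p."""
    out = (1, 0)
    for _ in range(P):
        out = mul(out, u)
    return out


def trace(u):
    """Tr(u) = u + u^p for the extension of degree m = 2."""
    return add(u, frobenius(u))


FIELD = [(a, b) for a in range(P) for b in range(P)]

if __name__ == "__main__":
    print({u: trace(u) for u in FIELD})
```

Every trace has zero $i$-component, so it lies in $\mathbb{Z}_3$; all three values occur, and the map is additive.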


33 Stickelberger (1890). 
34 Gauss (1981) pp. 616-618. 
Now set 

\[ \psi(\alpha) = e^{\frac{2\pi i\, \mathrm{Tr}(\alpha)}{p}} \tag{38.35} \]

so that 

\[ \psi(\alpha + \beta) = \psi(\alpha)\,\psi(\beta), \]

showing $\psi$ to be additive. Weil took the additive character $\psi : F \to \mathbb{C}$ to be a nonzero 
function satisfying 

\[ \psi(\alpha + \beta) = \psi(\alpha)\,\psi(\beta). \]

Clearly, $\psi(0) = 1$ and $\psi(1) = e^{\frac{2\pi i k}{p}}$, where $k$ is an integer. 

Next, to define a multiplicative character on $F$, note that $F^{\times} = F \setminus \{0\}$ is a cyclic 
group of order $q - 1$. Let us here take $g$ to be the generator of $F \setminus \{0\}$. A mapping 
$\chi : F^{\times} \to \mathbb{C}^{\times}$ that satisfies 

\[ \chi(\alpha\beta) = \chi(\alpha)\,\chi(\beta) \]

is called a multiplicative character; it is completely determined by its value on $g$, 
equal to a $(q - 1)$th root of unity. Denote by $\chi_0$ the character whose value at $g$ is 
1; $\chi_0$ represents the trivial character or principal character. The definition of $\chi_0$ may 
be extended to all $F$ by specifying $\chi_0(0) = 1$; all nontrivial multiplicative characters 
are extended by taking $\chi(0) = 0$. Observe that any multiplicative character may be 
written as 

\[ \chi(g) = e^{\frac{2\pi i s}{q-1}}, \qquad s = 0, 1, \ldots, q - 1. \tag{38.36} \]


With x as given in (38.35), the Gauss sum belonging to a character x is defined by 
g(x) = D> x(x) Qa). (38.37) 
xeF 


Stickelberger also defined the Jacobi sum, though he called it the Eisenstein sum,
for a general finite field:

$$J(\chi, \eta) = \sum_{\alpha \in F} \chi(\alpha)\eta(1 - \alpha). \tag{38.38}$$

In fact, Eisenstein had defined a more general sum for multiplicative characters
$\chi_1, \chi_2, \ldots, \chi_s$:

$$J(\chi_1, \chi_2, \ldots, \chi_s) = \sum_{\alpha_1 + \cdots + \alpha_s = 1} \chi_1(\alpha_1)\chi_2(\alpha_2)\cdots\chi_s(\alpha_s) \tag{38.39}$$

and applied it to obtain a new proof of the law of biquadratic reciprocity.³⁵


35 Eisenstein (1975) vol. 1, pp. 141-163.
At the beginning of his long paper, Stickelberger showed that (38.22) and (38.27)
would hold for a general finite field with $q$ elements: For a nontrivial character $\chi$ on $F$,

$$g(\chi)g(\bar{\chi}) = \chi(-1)q \tag{38.40}$$

and for nontrivial $\chi$, $\eta$, and $\chi\eta$,

$$J(\chi, \eta) = \frac{g(\chi)g(\eta)}{g(\chi\eta)}. \tag{38.41}$$

Stickelberger’s proofs of (38.40) and (38.41) follow the same lines as the argument
given by Eisenstein for (38.22) and (38.27). Note that since

$$\overline{g(\chi)} = \chi(-1)\,g(\bar{\chi}),$$

(38.40) implies that

$$|g(\chi)|^2 = q \quad\text{or}\quad |g(\chi)| = q^{1/2}. \tag{38.42}$$
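Relations (38.40), (38.41), and (38.42) are easy to test numerically on a prime field. The sketch below is an illustration only and covers just the case $q = p$; the function names and the brute-force search for a generator are choices made for this example, not anything taken from Stickelberger or Weil.

```python
import cmath

def primitive_root(p):
    """Return the smallest generator of the nonzero residues mod an odd prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("expected an odd prime")

def character(p, s):
    """Character with chi(g) = exp(2*pi*i*s/(p-1)), extended to 0 as in the text."""
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}
    def chi(x):
        x %= p
        if x == 0:
            return 1.0 if s % (p - 1) == 0 else 0.0
        return cmath.exp(2j * cmath.pi * s * log[x] / (p - 1))
    return chi

def gauss_sum(p, chi):
    """g(chi) with the additive character psi(x) = exp(2*pi*i*x/p)."""
    return sum(chi(x) * cmath.exp(2j * cmath.pi * x / p) for x in range(p))

def jacobi_sum(p, chi, eta):
    return sum(chi(x) * eta(1 - x) for x in range(p))

def errors(p, s, t):
    """Absolute errors in (38.40), (38.41), (38.42); needs chi, eta, chi*eta nontrivial."""
    chi, eta = character(p, s), character(p, t)
    chi_bar, chi_eta = character(p, -s), character(p, s + t)
    g_chi, g_eta = gauss_sum(p, chi), gauss_sum(p, eta)
    return {
        "38.40": abs(g_chi * gauss_sum(p, chi_bar) - chi(-1) * p),
        "38.41": abs(jacobi_sum(p, chi, eta) - g_chi * g_eta / gauss_sum(p, chi_eta)),
        "38.42": abs(abs(g_chi) ** 2 - p),
    }
```

For example, `errors(7, 1, 2)` compares both sides of each relation for two characters modulo 7; all three entries should vanish up to floating-point rounding.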


Weil applied Gauss and Jacobi sums to determine the number of solutions in $F$ of
the equation

$$a_0x_0^{n_0} + a_1x_1^{n_1} + \cdots + a_rx_r^{n_r} = b, \tag{38.43}$$

where $a_0, a_1, \ldots, a_r \in F^\times$ and $b \in F$. In his 1972 lectures on the history of number
theory, Weil talked about how he was led to this problem:³⁶


In 1947, in Chicago, I felt bored and depressed, and, not knowing what to do, I started reading
Gauss’s two memoirs on biquadratic residues, which I had never read before. The Gaussian
integers occur in the second paper. The first one deals essentially with the number of solutions of
equations $ax^4 - by^4 = 1$ in the prime field modulo $p$, and with the connection between these and
certain Gaussian sums; actually the method is exactly the same that is applied in the last section
of the Disquisitiones to the Gaussian sums of order 3 and the equations $ax^3 - by^3 = 1$. Then I
noticed that similar principles can be applied to all equations of the form $ax^m + by^n + cz^r + \cdots = 0$,
and that this implies the truth of the so-called “Riemann hypothesis” (of which more later) for
all curves $ax^n + by^n + cz^n = 0$ over finite fields, and also a “generalized Riemann hypothesis”
for varieties in projective space with a “diagonal” equation $\sum a_i x_i^n = 0$. This led me in turn to
conjectures about varieties over finite fields, some of which have been proved later by Dwork,
Grothendieck, M. Artin, and Lubkin, and some of which are still open.


Concerning the open conjectures, Weil added an epilogue, received in June 1973:³⁷


Reference has been made above to my conjectures of 1948, which included the extension of the 
“Riemann hypothesis” to algebraic varieties of arbitrary dimension over finite fields. 


Those conjectures have now been proved by Deligne. In the meanwhile, he had also shown, 
in conjunction with the work of Ihara, that their truth would imply the truth of Ramanujan’s 
conjecture on the $\tau$-function, which has been described above as “very much of an open problem.”


Number theory is not standing still. 


36 Weil (1974), especially pp. 106 and 110. Weil (1979) vol. II, p. 298. 
37 Weil (1979) vol. II, p. 302. 
André Weil began his 1949 paper, “Numbers of solutions of equations in finite
fields,”³⁸ with some very valuable historical perspective and insight:


The equations to be considered here are those of the type (38.43). Such equations have an
interesting history. In art. 358 of the Disquisitiones,³⁹ Gauss determines the Gaussian sums (the
so-called cyclotomic periods) of order 3, for a prime of the form $p = 3n + 1$, and at the same
time obtains the numbers of solutions for all congruences $ax^3 - by^3 \equiv 1 \pmod{p}$. He draws
attention himself to the elegance of his method, as well as to its wide scope; it is only much
later, however, viz. in his first memoir on biquadratic residues,⁴⁰ that he gave in print another
application of the same method; there he treats the next higher case, finds the number of solutions
of any congruence $ax^4 - by^4 \equiv 1 \pmod{p}$, for a prime of the form $p = 4n + 1$, and derives
from this the biquadratic character of 2 mod $p$, this being the ostensible purpose of the whole
highly ingenious and intricate investigation. As an incidental consequence (“coronidis loco”), he
also gives in substance the number of solutions of any congruence $y^2 \equiv ax^4 - b \pmod{p}$; this
result includes as a special case the theorem stated as a conjecture (“observatio per inductionem
facta gravissima”) in the last entry of his Tagebuch;⁴¹ and it implies the truth of what has lately
become known as the Riemann hypothesis, for the function-field defined by that equation over
the prime field of $p$ elements.


Gauss’s procedure is wholly elementary, and makes no use of the Gaussian sums, since it is rather 
his purpose to apply it to the determination of such sums. If one tries to apply it to more general 
cases, however, calculations soon become unwieldy, and one realizes the necessity of inverting 
it by taking Gaussian sums as a starting point. The means for doing so were supplied, as early as
1827, by Jacobi, in a letter to Gauss.⁴² But Lebesgue, who in 1837 devoted two papers⁴³ to the
case $n_0 = \cdots = n_r$ of equation (38.43), did not succeed in bringing out any striking result. The
whole problem seems then to have been forgotten until Hardy and Littlewood found it necessary
to obtain formulas for the number of solutions of the congruence $\sum_i x_i^n \equiv b \pmod{p}$ in their
work on the singular series for Waring’s problem;⁴⁴ they did so by means of Gaussian sums.
More recently, Davenport and Hasse⁴⁵ have applied the same method to the case $r = 2$, $b = 0$
of equation (1) as well as to other similar equations; however, as they were chiefly concerned
with other aspects of the problem, and in particular with its relation to the Riemann hypothesis in
function-fields, the really elementary character of their treatment does not appear clearly.


As equations of type (38.43) have again recently been the subject of some discussion,⁴⁶ it may
therefore serve a useful purpose to give here a brief but complete exposition of the topic. This
will contain nothing new, except perhaps in the mode of presentation of the final results, which 
will lead to the statement of some conjectures concerning the numbers of solutions of equations 
over finite fields, and their relation to the topological properties of the varieties defined by the 
corresponding equations over the field of complex numbers. 


To determine the number of solutions in $F$ of (38.43), following Weil,⁴⁷ suppose that
(38.43) has only one variable: $x^n = u$, where $u \in F$. Now, letting $N(u)$ be the number
of solutions of $x^n = u$, and with $n \mid q - 1$, we must prove

$$N(u) = \sum_{\chi^n = \chi_0} \chi(u), \tag{38.44}$$

where the sum is taken over all characters of order $d$ with $d \mid n$.

38 Weil (1949). Weil (1979) vol. 1, p. 399.
39 Gauss (1863-1927) vol. 1, pp. 445-449.
40 ibid. vol. II, pp. 67-92.
41 ibid. vol. X1, p. 571.
42 Jacobi (1969) vol. VII, pp. 393-400; also see vol. VI, pp. 254-274.
43 Lebesgue (1837) and (1838).
44 Hardy and Littlewood (1922).
45 Davenport and Hasse (1935).
46 See, e.g., Hua and Vandiver (1948).
47 Weil (1949).

Next, to prove (38.44), suppose that $s = (q - 1)/n$ in (38.36), so that $\chi(g) = e^{2\pi i/n}$; in
that case, $\chi_0, \chi, \chi^2, \ldots, \chi^{n-1}$ are the $n$ characters of orders that divide $n$. If $u = 0$ in
(38.44), then $N(u) = 1$, $x = 0$ being the only solution for that case. But the right-hand
side of (38.44) is also equal to 1, because $\chi_0(0) = 1$ is the only nonzero term in the
sum. Again, if $u \neq 0$ and $x^n = u$ has a solution, then $u = a^n$ for some $a \in F$. Thus,
in this case, $N(u) = n$, since the $n$ solutions are $a, ba, \ldots, b^{n-1}a$, where $b \in F$ is an
element of order $n$, so that $b^n = 1$. And the right-hand side of (38.44) is also equal to $n$
in this case, because each of its $n$ terms has the form
$\chi^j(u) = \chi^j(a^n) = (\chi^j)^n(a) = 1$. Taking
$u \neq 0$, if $x^n = u$ does not have a solution, then $N(u) = 0$. Moreover, if $g$ denotes
the generator of $F \setminus \{0\}$, then $u = g^k$ for some $k$ not divisible by $n$, and then
$\chi(u) = e^{2\pi i k/n} \neq 1$. Therefore, the right-hand side of (38.44), being the geometric sum
$1 + \chi(u) + \cdots + \chi(u)^{n-1} = (\chi(u)^n - 1)/(\chi(u) - 1)$, is also zero, completing Weil’s
proof.
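Formula (38.44) can be confirmed by direct enumeration on a prime field. The sketch below assumes $q = p$ prime and $n \mid p - 1$; the function names are invented for this illustration, and the characters are built from a generator found by brute force.

```python
import cmath

def primitive_root(p):
    """Return the smallest generator of the nonzero residues mod an odd prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("expected an odd prime")

def count_by_characters(p, n, u):
    """Right-hand side of (38.44): sum of chi^j(u), j = 0..n-1, with chi(g) = exp(2*pi*i/n)."""
    if u % p == 0:
        return 1  # only the trivial character is nonzero at 0
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}
    total = sum(cmath.exp(2j * cmath.pi * j * log[u % p] / n) for j in range(n))
    return round(total.real)

def count_directly(p, n, u):
    """Left-hand side of (38.44): the number of x in F_p with x^n = u."""
    return sum(1 for x in range(p) if (pow(x, n, p) - u) % p == 0)
```

Running both functions over all $u$ for, say, $p = 13$ and $n = 3$ or $n = 4$ exhibits the three cases of the proof: the value 1 at $u = 0$, the value $n$ at the $n$th powers, and 0 elsewhere.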

We now turn to an equation with two variables:

$$ax^m + by^n + c \equiv 0 \pmod{q}. \tag{38.45}$$

In a letter of January 1932 to Hasse, Davenport showed how to solve (38.45) using
Gauss and Jacobi sums:⁴⁸


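My dear Helmut,
I promised to send you my treatment of the congruence

$$(1) \qquad ax^m + by^n + c \equiv 0 \pmod{p}.$$

Let $\chi_1, \ldots, \chi_{m-1}$ be the nonprincipal characters for which $\chi^m = \chi_0$, the principal character.
It is easily seen that

$$1 + \chi_1(t) + \cdots + \chi_{m-1}(t)$$

is precisely the number of solutions of $x^m \equiv t$. Hence the number of solutions of (1) is

$$N = \sum_t \{1 + \chi_1(t) + \cdots + \chi_{m-1}(t)\}\left\{1 + X_1\!\left(-\frac{at + c}{b}\right) + \cdots + X_{n-1}\!\left(-\frac{at + c}{b}\right)\right\}$$

where $X_1, \ldots, X_{n-1}$ are the n.-p. [nonprincipal] characters for which $X^n = \chi_0$. Hence

$$N = p + \sum_{r=1}^{m-1}\sum_{s=1}^{n-1}\sum_t \chi_r(t)X_s\!\left(-\frac{at + c}{b}\right).$$

The sums $\sum_t$ can be easily expressed in terms of generalized Gaussian sums

$$\tau(\chi) = \sum_\nu \chi(\nu)e(\nu), \qquad e(x) = e^{2\pi i x/p}.$$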

48 Hasse and Davenport (2014) p. 19. 
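These have the property

$$\bar{\chi}(u)\tau(\chi) = \sum_\nu \chi(\nu)e(u\nu).$$

Hence

$$\begin{aligned}
\sum_t \chi(t)X(at + c) &= \frac{1}{\tau(\bar{X})}\sum_t\sum_\nu \chi(t)\,e((at + c)\nu)\,\bar{X}(\nu) \\
&= \frac{\tau(\chi)}{\tau(\bar{X})}\sum_\nu \bar{\chi}(a\nu)\bar{X}(\nu)e(c\nu) \\
&= \frac{\tau(\chi)\tau(\bar{\chi}\bar{X})}{\tau(\bar{X})}\,\bar{\chi}(a)\,\chi X(c).
\end{aligned}$$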
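Therefore

$$\begin{aligned}
N &= p + \sum_{r=1}^{m-1}\sum_{s=1}^{n-1} X_s\!\left(-\tfrac{1}{b}\right)\frac{\tau(\chi_r)\tau(\bar{\chi}_r\bar{X}_s)}{\tau(\bar{X}_s)}\,\bar{\chi}_r(a)\,\chi_rX_s(c) \\
&= p + \vartheta\sqrt{p}\,(m - 1)(n - 1) \quad \text{since } |\tau| = \sqrt{p},\ |\vartheta| \le 1 \\
&> 0 \quad \text{if } p > (m - 1)^2(n - 1)^2.
\end{aligned}$$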

Quite trivial! 


Here Davenport assumed that both $m$ and $n$ divide $p - 1$. It is not difficult to prove that
if $d = (m, p - 1)$, then the equation $x^m \equiv t \pmod{p}$ has the same number of solutions
as $x^d \equiv t \pmod{p}$. He moreover assumed that the product of two characters $\chi_i$ and
$X_j$ was not the identity; we shall see that this assumption also does not seriously affect
the calculation of the number of solutions of (38.45).
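Davenport's estimate $|N - p| \le (m - 1)(n - 1)\sqrt{p}$ can be watched at work by brute force. In the sketch below the prime $p = 151$, the exponents $m = 3$, $n = 5$ (both dividing $p - 1$ and coprime, so that no product $\chi_iX_j$ is principal), and the coefficients $a = 1$, $b = 2$, $c = 3$ are all arbitrary choices made for illustration.

```python
from math import sqrt

def count_solutions(p, m, n, a, b, c):
    """Count pairs (x, y) mod p with a*x^m + b*y^n + c = 0 (mod p) by enumeration."""
    return sum(
        1
        for x in range(p)
        for y in range(p)
        if (a * pow(x, m, p) + b * pow(y, n, p) + c) % p == 0
    )

p, m, n = 151, 3, 5               # m and n both divide p - 1 = 150
N = count_solutions(p, m, n, 1, 2, 3)
bound = (m - 1) * (n - 1) * sqrt(p)

# Davenport's conclusion: N differs from p by at most (m-1)(n-1)*sqrt(p),
# and N is positive once p exceeds (m-1)^2 (n-1)^2 = 64.
within_bound = abs(N - p) <= bound
```

Since $151 > 64$, the bound alone already guarantees that the congruence is solvable, which is the point of Davenport's closing inequality.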

Next, to calculate the number of solutions of equation (38.43) in general, Weil let $N$
denote the number of solutions when $b = 0$. Although Weil did not do so, here assume
that $n_i \mid q - 1$, $i = 0, 1, \ldots, r$; as noted before, this restriction does not produce a loss
of generality. For if $n \nmid q - 1$, then we replace $n$ by $d = (n, q - 1)$. Thus

$$N = \sum_{\sum a_iu_i = 0} N(x_0^{n_0} = u_0)\,N(x_1^{n_1} = u_1)\cdots N(x_r^{n_r} = u_r). \tag{38.46}$$

Let $\eta_i$, $i = 0, 1, \ldots, r$, denote any character whose order divides $n_i$, meaning that
$\eta_i^{n_i} = \chi_0$ on $F^\times$. Recall that $\chi_0$ denotes the trivial character. Then (38.46) can be
rewritten as

$$N = \sum_{\eta_0, \ldots, \eta_r}\ \sum_{a_0u_0 + \cdots + a_ru_r = 0} \eta_0(u_0)\eta_1(u_1)\cdots\eta_r(u_r), \tag{38.47}$$

where the first sum is taken over all characters $\eta_i$ of order dividing $n_i$, $i = 0, 1, \ldots, r$.
Now if we let $t_i = a_iu_i$, $i = 0, 1, \ldots, r$, then the sum in (38.47) can be
written as

$$N = \sum_{\eta_0, \ldots, \eta_r} \eta_0(a_0^{-1})\eta_1(a_1^{-1})\cdots\eta_r(a_r^{-1}) \sum_{t_0 + \cdots + t_r = 0} \eta_0(t_0)\cdots\eta_r(t_r). \tag{38.48}$$
Observe that if all characters in the inner sum (38.48) are trivial, then that part of
the sum is $q^r$, because the first $r$ values in the equation $t_0 + t_1 + \cdots + t_r = 0$ can
be chosen arbitrarily so that the value of $t_r$ is fixed. And if some but not all of those
characters are trivial, then the sum corresponding to those characters is zero. Thus, for
example, if $\eta_0, \ldots, \eta_j$ are trivial characters and $\eta_{j+1}, \ldots, \eta_r$ are all nontrivial, then
the part of the sum corresponding to these latter characters would take the form

$$\sum_t \eta_{j+1}(t) \sum_t \eta_{j+2}(t) \cdots \sum_t \eta_r(t)$$

and the value of each of these sums is zero.

If all characters are nontrivial in the inner sum, and so is the product $\eta_0\eta_1\cdots\eta_r$,
then the value of that part of the sum in (38.48) is zero. To verify this, observe that in
the inner sum of (38.48) the condition on $t_i$ can be written as

$$t_1 + t_2 + \cdots + t_r = -t_0.$$

Set

$$t_i = -t_0s_i, \quad i = 1, 2, \ldots, r,$$

so that

$$\sum_{t_0 + \cdots + t_r = 0} \eta_0(t_0)\cdots\eta_r(t_r) = \eta_0(-1)\sum_{t_0 \neq 0}(\eta_0\eta_1\cdots\eta_r)(-t_0)\,J(\eta_1, \eta_2, \ldots, \eta_r). \tag{38.49}$$

But $\eta_0\eta_1\cdots\eta_r$ is a nontrivial character, so the sum on the right-hand side of (38.49)
is zero.

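The final case to be considered is that in which $\eta_0, \eta_1, \ldots, \eta_r$ are nontrivial, but
$\eta_0\eta_1\cdots\eta_r$ is trivial. In this case, we get

$$\eta_0(-1)\sum_{t_0 \neq 0}(\eta_0\eta_1\cdots\eta_r)(-t_0) = \eta_0(-1)(q - 1),$$

so that in this instance, (38.49) takes the form

$$\sum_{t_0 + \cdots + t_r = 0} \eta_0(t_0)\cdots\eta_r(t_r) = \eta_0(-1)(q - 1)\,J(\eta_1, \eta_2, \ldots, \eta_r). \tag{38.50}$$

We therefore get Weil’s equation that

$$N = q^r + (q - 1)\sum_{\eta_0, \ldots, \eta_r} \eta_0(-1)\,\eta_0(a_0^{-1})\cdots\eta_r(a_r^{-1})\,J(\eta_1, \eta_2, \ldots, \eta_r), \tag{38.51}$$

where $\eta_i$, $i = 0, 1, \ldots, r$ are all nontrivial and $\eta_0\eta_1\cdots\eta_r = \chi_0$, the trivial character.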
Observe that $\eta_1, \eta_2, \ldots, \eta_r$ and $\eta_1\eta_2\cdots\eta_r$ are all nontrivial. In this situation, it can
be shown that

$$J(\eta_1, \eta_2, \ldots, \eta_r) = \frac{g(\eta_1)g(\eta_2)\cdots g(\eta_r)}{g(\eta_1\eta_2\cdots\eta_r)}, \tag{38.52}$$

an equation that can be demonstrated in exactly the same way as (38.41). Since
$\eta_0\eta_1\cdots\eta_r = \chi_0$, the trivial character, it follows that $\eta_1\eta_2\cdots\eta_r = \eta_0^{-1}$. Now we
multiply the right-hand side of (38.52) by $g(\eta_0)/g(\eta_0)$ and, using (38.40), arrive at

$$J(\eta_1, \eta_2, \ldots, \eta_r) = \frac{g(\eta_0)g(\eta_1)\cdots g(\eta_r)}{g(\eta_0)g(\eta_0^{-1})} = \frac{\eta_0(-1)}{q}\,g(\eta_0)\cdots g(\eta_r), \tag{38.53}$$

verifying the case in which all characters are nontrivial. Weil, however, proved
equation (38.53) in a slightly different manner. He used (38.40) to obtain the Fourier
expansion of a nontrivial character $\eta(x)$ on $F$:

$$\eta(x) = \frac{g(\eta)}{q}\sum_t \bar{\eta}(t)\,\overline{\psi(tx)}. \tag{38.54}$$

Substituting (38.54) for $\eta_0(t_0), \ldots, \eta_r(t_r)$ in the inner sum of (38.48), he simplified
the resulting expression to derive (38.53). Weil next applied (38.42) to (38.51) and
(38.53) to arrive at


$$|N - q^r| \le M(q - 1)q^{(r-1)/2}, \tag{38.55}$$

where $M$ was defined as the number of systems of nontrivial characters $\eta_i$ of order
dividing $n_i$, $i = 0, 1, \ldots, r$, such that $\eta_0\eta_1\cdots\eta_r$ was trivial. He also found a similar
inequality for the number of solutions of equation (38.43) when $b \neq 0$. We note, as did
Weil, that Hua and Vandiver independently obtained (38.51).⁴⁹
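Formula (38.51), with $J$ replaced by Gauss sums through (38.53), and the bound (38.55) can both be compared with a brute-force count. The sketch below is limited to a prime field $F_p$ with every $n_i$ dividing $p - 1$; the function names and the sample equations are assumptions made for illustration. The two factors $\eta_0(-1)$ coming from (38.51) and (38.53) cancel, which is why neither appears in the code.

```python
import cmath
from itertools import product

def primitive_root(p):
    """Return the smallest generator of the nonzero residues mod an odd prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("expected an odd prime")

def weil_count(p, coeffs, exps):
    """Evaluate (38.51) via (38.53) over F_p; returns (N, M). Each exponent must divide p - 1."""
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}

    def eta(s, n, x):
        # character of order dividing n with eta(g) = exp(2*pi*i*s/n); x must be nonzero
        return cmath.exp(2j * cmath.pi * s * log[x % p] / n)

    def gauss(s, n):
        return sum(eta(s, n, x) * cmath.exp(2j * cmath.pi * x / p) for x in range(1, p))

    r = len(coeffs) - 1
    total, M = 0, 0
    for ss in product(*(range(1, n) for n in exps)):
        # the product eta_0 ... eta_r is trivial iff the sum of s_i / n_i is an integer
        if sum(s * (p - 1) // n for s, n in zip(ss, exps)) % (p - 1) != 0:
            continue
        M += 1
        term = 1
        for s, n, a in zip(ss, exps, coeffs):
            term *= eta(s, n, a).conjugate() * gauss(s, n)  # eta(a^-1) = conj(eta(a))
        total += term
    return p ** r + (p - 1) * total / p, M

def brute_count(p, coeffs, exps):
    """Count all solutions of a_0 x_0^{n_0} + ... + a_r x_r^{n_r} = 0 in F_p directly."""
    return sum(
        1
        for xs in product(range(p), repeat=len(coeffs))
        if sum(a * pow(x, n, p) for a, x, n in zip(coeffs, xs, exps)) % p == 0
    )
```

For $x_0^3 + x_1^3 + x_2^3 = 0$ over $F_7$, the cubes take only the values $0, \pm 1$, and a hand count gives 55 solutions; `weil_count(7, (1, 1, 1), (3, 3, 3))` should reproduce this value with $M = 2$.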

Weil next considered the solutions of the equation

$$a_0x_0^n + a_1x_1^n + \cdots + a_rx_r^n = 0, \tag{38.56}$$

defined in $r$-dimensional projective space: the set of points $(c_0, c_1, \ldots, c_r)$ where
$c_i \in F$, $i = 0, 1, \ldots, r$ and not all $c_i$ are zero. Two points $(c_0, c_1, \ldots, c_r)$ and
$(d_0, d_1, \ldots, d_r)$ are equivalent if there is a nonzero $a \in F$ such that $c_i = ad_i$,
$i = 0, 1, \ldots, r$; this is clearly an equivalence relation. An equivalence class describes a
point in projective space. It is also clear that the number of points in the $r$-dimensional
projective space may be given as

$$\frac{q^{r+1} - 1}{q - 1} = q^r + q^{r-1} + \cdots + q + 1.$$


49 Hua and Vandiver (1949). 
If $\bar{N}$ denotes the number of solutions of (38.56) and $N$ the number of solutions in
the sense of (38.51), then manifestly $N = 1 + (q - 1)\bar{N}$. Using (38.51), Weil had

$$\bar{N} = 1 + q + \cdots + q^{r-1} + \sum_{\eta_0, \ldots, \eta_r} \eta_0(-1)\,\bar{\eta}_0(a_0)\cdots\bar{\eta}_r(a_r)\,J(\eta_1, \ldots, \eta_r), \tag{38.57}$$

where $\eta_i$, $i = 0, 1, \ldots, r$ are all nontrivial and $\eta_0\eta_1\cdots\eta_r = \chi_0$, the trivial character.
He denoted by $\bar{N}_\nu$ the number of solutions of (38.56) in the extension $F_\nu$ of $F$ of
degree $\nu$; he then calculated the series

$$\sum_{\nu=1}^{\infty} \bar{N}_\nu U^{\nu-1}. \tag{38.58}$$


For this purpose, he needed a result now called the Davenport–Hasse relation. We
give some definitions required to state the result: Let $F$ be a finite field with $q$ elements
and let $E$ be a finite extension of $F$ with $q^\nu$ elements. For $\alpha \in E$, the trace of $\alpha$ from
$E$ to $F$ is defined by

$$\operatorname{Tr}_{E/F}(\alpha) = \alpha + \alpha^q + \cdots + \alpha^{q^{\nu-1}}.$$

The norm of $\alpha$ from $E$ to $F$ is defined as

$$N_{E/F}(\alpha) = \alpha \cdot \alpha^q \cdots \alpha^{q^{\nu-1}}.$$

Next, if $\chi$ is a multiplicative character on $F$, then define $\chi'$ on $F_\nu$ by
$\chi' = \chi \circ N_{F_\nu/F}$. And if $\psi$ is an additive character on $F$, then set
$\psi' = \psi \circ \operatorname{Tr}_{F_\nu/F}$. Now if the
Gauss sum belonging to a character $\chi$ on $F$ is given by

$$g_\chi = \sum_{x \in F} \chi(x)\psi(x), \tag{38.59}$$

then we have the corresponding Gauss sum $g'_\chi$ belonging to the character $\chi'$ on $F_\nu$:

$$g'_\chi = \sum_{y \in F_\nu} \chi'(y)\psi'(y). \tag{38.60}$$

The Davenport–Hasse relation⁵⁰ states

$$-g'_\chi = (-g_\chi)^\nu. \tag{38.61}$$
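The relation (38.61) can be tested numerically in the quadratic case $\nu = 2$. The sketch below is an illustration under stated assumptions: it takes $F = F_p$ with $p \equiv 3 \pmod 4$, so that $F_{p^2}$ may be modelled as $F_p[i]$ with $i^2 = -1$; in that model the norm of $a + bi$ is $a^2 + b^2$ and its trace is $2a$. The function names are invented for this example.

```python
import cmath

def primitive_root(p):
    """Return the smallest generator of the nonzero residues mod an odd prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("expected an odd prime")

def davenport_hasse_sides(p, s):
    """Return (-g', (-g)^2) for the nontrivial character chi(g0) = exp(2*pi*i*s/(p-1)).

    Assumes p % 4 == 3, so that F_{p^2} = F_p[i] with i^2 = -1.
    """
    if p % 4 != 3:
        raise ValueError("this model of the quadratic extension needs p = 3 (mod 4)")
    g0 = primitive_root(p)
    log = {pow(g0, k, p): k for k in range(p - 1)}

    def chi(x):
        x %= p
        return 0 if x == 0 else cmath.exp(2j * cmath.pi * s * log[x] / (p - 1))

    def psi(x):
        return cmath.exp(2j * cmath.pi * (x % p) / p)

    g = sum(chi(x) * psi(x) for x in range(p))
    # chi' = chi o Norm and psi' = psi o Trace on the elements a + b*i of the extension
    g_ext = sum(chi(a * a + b * b) * psi(2 * a) for a in range(p) for b in range(p))
    return -g_ext, (-g) ** 2
```

For $p = 3$ and the quadratic character, a hand computation gives $g_\chi = i\sqrt{3}$ and $g'_\chi = 3$, so that both sides equal $-3$.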


We present Weil’s proof of (38.61), since it is simpler than that of Davenport and
Hasse. For each monic polynomial of degree $n \ge 1$ with coefficients in the base
field, here denoted $K$,

$$f(X) = X^n + c_1X^{n-1} + \cdots + c_n,$$

set

$$\lambda(f) = \chi(c_n)\psi(c_1).$$

Let $\deg f$ denote the degree of the polynomial $f$ and let

$$g(X) = X^m + d_1X^{m-1} + \cdots + d_m,$$

so that it is clear that

$$\begin{aligned}
\lambda(fg) &= \chi(c_nd_m)\,\psi(c_1 + d_1) \\
&= \chi(c_n)\psi(c_1)\,\chi(d_m)\psi(d_1) \\
&= \lambda(f)\lambda(g).
\end{aligned} \tag{38.62}$$

50 Davenport and Hasse (1935); see formula (0.8).


Set $P$ as an irreducible monic polynomial with coefficients in $K$ and set $U$ as an
indeterminate. Weil gave the formal identity

$$1 + \sum_f \lambda(f)U^{\deg f} = \prod_P \left(1 - \lambda(P)U^{\deg P}\right)^{-1}, \tag{38.63}$$

to prove which, first note that every monic polynomial $f$ is uniquely a product of
irreducible polynomials $P$. Then using (38.62), the right-hand side of (38.63) can be
expressed as

$$\prod_P \left(1 + \lambda(P)U^{\deg P} + \lambda(P)^2U^{2\deg P} + \cdots\right) = 1 + \sum_f \lambda(f)U^{\deg f}.$$

Weil observed that the sum on the left-hand side of (38.63) for polynomials $f(X) =
X + c$, of degree 1, would be

$$\sum_{\deg f = 1} \lambda(f)U = \left(\sum_{c \in K} \chi(c)\psi(c)\right)U = g_\chi U.$$

Moreover, the sum on the left-hand side of (38.63), for monic polynomials of degree
$n > 1$, was

$$q^{n-2}\left(\sum_{c_1} \psi(c_1)\right)\left(\sum_{c_n} \chi(c_n)\right)U^n.$$

Now because $\sum_{c_n} \chi(c_n) = 0$, all the coefficients of $U^n$ for $n > 1$, on the left-hand
side of (38.63) vanished, so that (38.63) would reduce to

$$1 + g_\chi U = \prod_P \left(1 - \lambda(P)U^{\deg P}\right)^{-1}. \tag{38.64}$$
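The collapse of the left-hand side of (38.63) to $1 + g_\chi U$ can be observed directly by summing $\lambda(f)$ over all monic polynomials of a given degree. The sketch below takes $K = F_p$ and represents $f = X^n + c_1X^{n-1} + \cdots + c_n$ by the tuple $(c_1, \ldots, c_n)$; the function names are invented for this illustration.

```python
import cmath
from itertools import product

def primitive_root(p):
    """Return the smallest generator of the nonzero residues mod an odd prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("expected an odd prime")

def lambda_sum(p, s, n):
    """Sum of lambda(f) = chi(c_n) * psi(c_1) over all monic f of degree n over F_p.

    chi is the nontrivial character with chi(g) = exp(2*pi*i*s/(p-1)), 0 < s < p - 1.
    """
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}

    def chi(c):
        return 0 if c % p == 0 else cmath.exp(2j * cmath.pi * s * log[c % p] / (p - 1))

    def psi(c):
        return cmath.exp(2j * cmath.pi * c / p)

    # each tuple (c_1, ..., c_n) stands for X^n + c_1 X^(n-1) + ... + c_n
    return sum(chi(c[-1]) * psi(c[0]) for c in product(range(p), repeat=n))
```

With $p = 5$ and $s = 1$, the degree-1 sum is the Gauss sum itself, of absolute value $\sqrt{5}$, while the sums for degrees 2 and 3 vanish up to rounding.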
Similarly, Weil noted that if

$$F(X) = X^n + d_1X^{n-1} + \cdots + d_n$$

was a polynomial over $K' = K_\nu$, and

$$\lambda'(F) = \chi'(d_n)\psi'(d_1),$$

and $U'$ denoted an indeterminate, then

$$1 + g'_\chi U' = \prod_{P'} \left(1 - \lambda'(P')U'^{\deg P'}\right)^{-1}, \tag{38.65}$$

where the product was taken over all monic irreducible polynomials $P'$ over $K' = K_\nu$. Weil
took $P$ as a monic irreducible polynomial over $K$ and set $P'$ as an irreducible factor
of $P$ over $K'$. He proved that

$$\lambda'(P') = \lambda(P)^{\nu/d}, \tag{38.66}$$

where $d$ is the greatest common divisor of $\deg P$ and $\nu$.


To prove (38.66), Weil let $-\xi$ be a root of $P'$ and noted that $K(\xi)$ was an extension
of $K$ of degree $= \deg P = m$, while $K'(\xi)$ was an extension of $K'$ of degree $=
\deg P' = m'$. Since $K'(\xi)$ was the smallest field containing $K(\xi)$ and $K'$, its degree
over $K$ was the l.c.m. of $m$ and $\nu$ and was thus equal to $m\nu/d$, where $d =$ g.c.d. $(m, \nu)$.
This implied that $m' = m/d$; thus, $P$ had $d$ irreducible factors, each of degree $m/d$. Taking
$a$ and $b$ to be the norm and trace, respectively, of $\xi$ from $K(\xi)$ to $K$, Weil had

$$P(X) = X^m + bX^{m-1} + \cdots + a$$

and

$$\lambda(P) = \chi(a)\psi(b).$$

Recall that Weil took $-\xi$ to be a root of $P(X)$. Similarly, with $a'$ and $b'$ denoting
the norm and trace, respectively, of $\xi$ from $K'(\xi)$ to $K'$, and where $N = N_{K'/K}$,
$T = \operatorname{Tr}_{K'/K}$, he arrived at

$$\lambda'(P') = \chi'(a')\psi'(b') = \chi(Na')\psi(Tb').$$

Finally, Weil completed the proof of (38.66) by observing that

$$\begin{aligned}
Na' &= N_{K'/K}\left(N_{K'(\xi)/K'}(\xi)\right) = N_{K(\xi)/K}\left(N_{K'(\xi)/K(\xi)}(\xi)\right) \\
&= N_{K(\xi)/K}\left(\xi^{\nu/d}\right) = a^{\nu/d}
\end{aligned}$$

and, similarly,

$$Tb' = \frac{\nu}{d}\,b.$$
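After changing $U'$ to $U^\nu$, relation (38.66) produced the factor on the right-hand
side of (38.65):

$$\left(1 - \lambda(P)^{\nu/d}U^{m\nu/d}\right)^{-1}.$$

Since each of the $d$ irreducible factors of $P$ produced the same factor, the
contribution by $P$ to the right-hand side of (38.65) was

$$\left(1 - \lambda(P)^{\nu/d}U^{m\nu/d}\right)^{-d}, \tag{38.67}$$

and Weil noted that this could be written as

$$\prod_{\rho=0}^{\nu-1} \left(1 - \lambda(P)(\zeta^\rho U)^m\right)^{-1}$$

with $\zeta$ being any primitive $\nu$th root of unity. This is not difficult to prove; although
Weil omitted the argument, we offer one here: Let $\nu = dt$ so that we have to show that

$$\left(1 - \lambda(P)^tU^{mt}\right)^d = \prod_{\rho=0}^{\nu-1} \left(1 - \lambda(P)(\zeta^\rho U)^m\right).$$

Let $\zeta = e^{2\pi i/\nu}$. Then, since $m = dm'$,

$$\zeta^{\rho m} = e^{2\pi i\rho m/\nu} = e^{2\pi i\rho m'/t}$$

and

$$\begin{aligned}
\prod_{\rho=0}^{\nu-1} \left(1 - e^{2\pi i\rho m'/t}\lambda(P)U^m\right)
&= \prod_{\rho=0}^{t-1} \left(1 - e^{2\pi i\rho m'/t}\lambda(P)U^m\right) \prod_{\rho=t}^{2t-1} \left(1 - e^{2\pi i\rho m'/t}\lambda(P)U^m\right) \cdots \\
&= \left(1 - \lambda(P)^tU^{mt}\right)\left(1 - \lambda(P)^tU^{mt}\right) \cdots \\
&= \left(1 - \lambda(P)^tU^{mt}\right)^d,
\end{aligned}$$

because $m'$ and $t$ are relatively prime, so that in each block of $t$ consecutive values of
$\rho$ the numbers $e^{2\pi i\rho m'/t}$ run through all the $t$th roots of unity.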
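The root-of-unity identity just proved lends itself to a quick floating-point check. In the sketch below, `lam` and `U` are arbitrary complex numbers standing in for $\lambda(P)$ and the indeterminate $U$; the function name is invented for this illustration.

```python
import cmath
from math import gcd

def both_sides(lam, U, m, nu):
    """Return ((1 - lam^t U^(m t))^d, product over rho of (1 - lam (zeta^rho U)^m))."""
    d = gcd(m, nu)
    t = nu // d
    zeta = cmath.exp(2j * cmath.pi / nu)  # a primitive nu-th root of unity
    left = (1 - lam ** t * U ** (m * t)) ** d
    right = 1
    for rho in range(nu):
        right *= 1 - lam * (zeta ** rho * U) ** m
    return left, right
```

Trying pairs such as $(m, \nu) = (4, 6)$, where $d = 2$ and $t = 3$, or $(3, 5)$, where $d = 1$, shows the two sides agreeing up to rounding.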


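Equation (38.65) has thus been transformed into

$$\begin{aligned}
1 - (-g'_\chi)U^\nu &= \prod_{\rho=0}^{\nu-1} \prod_P \left(1 - \lambda(P)(\zeta^\rho U)^{\deg P}\right)^{-1} \\
&= \prod_{\rho=0}^{\nu-1} \left(1 + g_\chi\zeta^\rho U\right) \\
&= \prod_{\rho=0}^{\nu-1} \left(1 - \zeta^\rho(-g_\chi U)\right) \\
&= 1 - (-g_\chi)^\nu U^\nu,
\end{aligned} \tag{38.68}$$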
proving the Davenport–Hasse relation. A slightly simpler version of this proof of Weil,
due to Paul Monsky, is given by Ireland and Rosen.⁵¹ The simplification is brought
about by proving

$$g'_\chi = \sum_P (\deg P)\,\lambda(P)^{\nu/\deg P},$$

where the sum is taken over all monic irreducible $P$ whose degree divides $\nu$. With this
result in hand, take the logarithmic derivative of (38.64) and multiply by $U$ to arrive at

$$\frac{g_\chi U}{1 + g_\chi U} = \sum_P \frac{\lambda(P)(\deg P)U^{\deg P}}{1 - \lambda(P)U^{\deg P}}$$

or, after expanding as an infinite series,

$$\sum_{\nu=1}^{\infty} (-1)^{\nu-1}g_\chi^\nu U^\nu = \sum_P \sum_{r=1}^{\infty} (\deg P)\,\lambda(P)^rU^{r\deg P}.$$

Now equate the coefficients of $U^\nu$ to obtain the Davenport–Hasse formula.
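Monsky's sum over irreducible polynomials can be evaluated without ever constructing the extension field, and then compared with $(-1)^{\nu-1}g_\chi^\nu$. The sketch below works over $K = F_p$, represents a monic polynomial by the tuple of its non-leading coefficients, and finds irreducibles by discarding all products of lower-degree monic polynomials; this sieve and the function names are choices made for illustration, suitable only for very small $p$ and $\nu$.

```python
import cmath
from itertools import product

def primitive_root(p):
    """Return the smallest generator of the nonzero residues mod an odd prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("expected an odd prime")

def multiply(f, h, p):
    """Multiply monic polynomials given by their non-leading coefficients (c_1, ..., c_d)."""
    F, H = (1,) + f, (1,) + h
    out = [0] * (len(F) + len(H) - 1)
    for i, a in enumerate(F):
        for j, b in enumerate(H):
            out[i + j] = (out[i + j] + a * b) % p
    return tuple(out[1:])

def irreducibles(p, max_deg):
    """Monic irreducibles of each degree up to max_deg, found by removing all proper products."""
    polys = {d: list(product(range(p), repeat=d)) for d in range(1, max_deg + 1)}
    reducible = set()
    for d1 in range(1, max_deg):
        for d2 in range(d1, max_deg - d1 + 1):
            reducible.update(multiply(f, h, p) for f in polys[d1] for h in polys[d2])
    return {d: [f for f in polys[d] if f not in reducible] for d in polys}

def monsky_sides(p, s, nu):
    """Return (sum of deg(P) * lambda(P)^(nu/deg P) over deg P | nu, (-1)^(nu-1) * g^nu)."""
    g0 = primitive_root(p)
    log = {pow(g0, k, p): k for k in range(p - 1)}
    chi = lambda c: 0 if c % p == 0 else cmath.exp(2j * cmath.pi * s * log[c % p] / (p - 1))
    psi = lambda c: cmath.exp(2j * cmath.pi * c / p)
    irr = irreducibles(p, nu)
    total = sum(
        d * (chi(P[-1]) * psi(P[0])) ** (nu // d)
        for d in irr if nu % d == 0
        for P in irr[d]
    )
    gauss = sum(chi(c) * psi(c) for c in range(p))
    return total, (-1) ** (nu - 1) * gauss ** nu
```

For $p = 3$, $\nu = 2$, and the quadratic character, the three linear polynomials contribute $-1$ and the three irreducible quadratics contribute $4$, giving $3$ on both sides.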


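Weil took the D-H formula, together with (38.53) and (38.57), to conclude that the series
given in (38.58) could be expressed as

$$\begin{aligned}
\sum_{\nu=1}^{\infty} \bar{N}_\nu U^{\nu-1}
&= \sum_{\nu=1}^{\infty} \left(1 + q^\nu + \cdots + q^{\nu(r-1)}\right)U^{\nu-1} \\
&\quad + (-1)^{r+1}\sum_{\nu=1}^{\infty}\ \sum_{\eta_0, \ldots, \eta_r} \left(\frac{(-1)^{r+1}}{q}\,\bar{\eta}_0(a_0)\cdots\bar{\eta}_r(a_r)\,g(\eta_0)\cdots g(\eta_r)\right)^{\nu} U^{\nu-1} \\
&= -\sum_{j=0}^{r-1} \frac{d}{dU}\log\left(1 - q^jU\right) \\
&\quad + (-1)^r \sum_{\eta_0, \ldots, \eta_r} \frac{d}{dU}\log\left(1 - \frac{(-1)^{r+1}}{q}\,\bar{\eta}_0(a_0)\cdots\bar{\eta}_r(a_r)\,g(\eta_0)\cdots g(\eta_r)\,U\right) \\
&= \frac{d}{dU}\log Z(U).
\end{aligned}$$

Note that here we have taken

$$Z(U) = \frac{P(U)^{(-1)^r}}{(1 - U)(1 - qU)\cdots(1 - q^{r-1}U)}$$

and

$$P(U) = \prod_{\eta_0, \eta_1, \ldots, \eta_r} \left(1 - \frac{(-1)^{r+1}}{q}\,\bar{\eta}_0(a_0)\cdots\bar{\eta}_r(a_r)\,g(\eta_0)\cdots g(\eta_r)\,U\right).$$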
10. 715--+5r 


51 Ireland and Rosen (1982) pp. 164-165.



Recall that all the $\eta_i$, $i = 0, 1, \ldots, r$, are nontrivial, while their product, $\eta_0\eta_1\cdots\eta_r$,
is trivial.

This result was among those that led André Weil at the end of his 1949 paper to
formulate the famous Weil conjectures.


38.9 Exercises 


(1) Let $P$ be a monic irreducible polynomial in $R = \mathbb{Z}_p[x]$, and let $|P|$ denote the
number of elements in $R/(P)$. Set

$$\zeta_R(s) = \prod_P \left(1 - |P|^{-s}\right)^{-1},$$

where the product is taken over all monic irreducible polynomials in $R$.
Determine $\zeta_R(s)$ and compare your result with equation (38.4).
(2) The last entry in Gauss’s diary, dated July 9, 1814, reads (in translation): 


I have made by induction the most important observation that connects the theory of
biquadratic residues most elegantly with the lemniscatic functions. Suppose $a + bi$ is a
prime number, $a - 1 + bi$ divisible by $2 + 2i$, the number of all solutions to the congruence

$$1 \equiv xx + yy + xxyy \pmod{a + bi},$$

including $x = \infty$, $y = \pm i$, $x = \pm i$, $y = \infty$ is $= (a - 1)^2 + bb$.


Prove Gauss’s theorem. Note that the diary was discovered in 1897 and 
published in 1903. See Ireland and Rosen (1982) pp. 166-168, where a proof 
using Gauss and Jacobi sums is given. In 1921, Herglotz gave the first proof 
of Gauss’s last entry by using complex multiplication of elliptic functions. 
Chapter 10 of Lemmermeyer (2000) gives an excellent discussion of this topic, 
including useful historical notes. See also Weil (1979) vol. 3, p. 298, for some 
perceptive historical remarks, pointing out the connection between Gauss’s 
diary entry and lemniscatic function. 


(3) Let $p$ be an odd prime, and let $Q$, $R$ be irreducible polynomials of degrees $\pi$
and $\rho$ in $\mathbb{Z}_p[x]$. With $f$ any polynomial in this ring, let $\left(\frac{f}{Q}\right)$ denote the unique
element of $\mathbb{Z}_p$ such that

$$f^{(p^\pi - 1)/2} \equiv \left(\frac{f}{Q}\right) \pmod{Q}.$$

Show that

$$\left(\frac{Q}{R}\right)\left(\frac{R}{Q}\right) = (-1)^{\pi\rho(p-1)/2}.$$

See Dedekind (1930) vol. I, pp. 56-59, for a proof of this analog of the law of
quadratic reciprocity.

(4) Generalize the Euler totient function $\varphi$ to the ring $\mathbb{Z}_p[x]$; state and prove
a formula analogous to $\varphi(m) = m\left(1 - \frac{1}{p_1}\right)\cdots\left(1 - \frac{1}{p_k}\right)$, where $p_1, \ldots, p_k$
comprise all the distinct prime factors of the positive integer $m$. See Dedekind
(1930) vol. I, pp. 50-51.

(5) State a generalization of Dirichlet’s theorem on primes in an arithmetic 
progression to the ring $F_q[x]$, where $F_q$ is a finite field with $q = p^n$ elements,
p a prime. Rosen (2002) offers a statement and a proof of this theorem and a 
reference to Kornblum’s paper. Compare Rosen’s proof with that of Kornblum. 

(6) Let $q = p^n$ and $a_0, a_1, \ldots, a_r$ be nonzero elements of $F_q$. Let $n_0, n_1, \ldots, n_r$ be
positive integers, and let $d_i$ denote the greatest common divisor of $q - 1$ and
$n_i$. Let $N_1$ represent the number of solutions in $F_q$ of the equation

$$a_0x_0^{n_0} + a_1x_1^{n_1} + \cdots + a_rx_r^{n_r} + 1 = 0.$$

Prove that

$$|N_1 - q^r| \le (d_0 - 1)\cdots(d_r - 1)\,q^{r/2}.$$

See Weil (1979) vol. I, pp. 399-410. On the basis of some of his earlier
theorems and this result, Weil made four conjectures for zeta functions of
smooth projective varieties over a finite base field; see pp. 409-410 for a
statement of these conjectures, one of which was the Riemann hypothesis.

(7) Define the Ramanujan $\tau(n)$ function by the formula

$$q\prod_{n=1}^{\infty} (1 - q^n)^{24} = \sum_{n=1}^{\infty} \tau(n)q^n.$$

Assuming the convergence of the series and the product, show that

$$\sum_{n=1}^{\infty} \frac{\tau(n)}{n^s} = \prod_p \left(1 - \tau(p)p^{-s} + p^{11-2s}\right)^{-1},$$

where the product is over all the primes. This result was conjectured by
Ramanujan and proved in 1917 by Louis J. Mordell (1888-1972). See Hardy
(1978) pp. 161-165. Ramanujan also conjectured that $|\tau(p)| \le 2p^{11/2}$. This was
deduced by Pierre Deligne from his 1974 proof of the characteristic $p$ Riemann
hypothesis.
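Before attempting the last exercise, it may help to tabulate a few values of $\tau(n)$. The sketch below simply multiplies out the product up to a chosen degree; the function name and the truncation parameter `n_max` are conveniences invented for this illustration, and nothing in it constitutes a proof.

```python
def tau_values(n_max):
    """Return {n: tau(n)} for 1 <= n <= n_max by expanding q * prod (1 - q^n)^24."""
    coeffs = [0] * n_max  # coefficients of prod (1 - q^n)^24 up to q^(n_max - 1)
    coeffs[0] = 1
    for n in range(1, n_max):
        for _ in range(24):
            # multiply the truncated series in place by (1 - q^n)
            for k in range(n_max - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    # the extra factor q shifts every exponent by one
    return {n: coeffs[n - 1] for n in range(1, n_max + 1)}

tau = tau_values(12)
```

The table begins $\tau(1) = 1$, $\tau(2) = -24$, $\tau(3) = 252$, and one sees at once that $\tau(6) = \tau(2)\tau(3)$, the simplest instance of the multiplicativity encoded in the Euler product.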


38.10 Notes on the Literature 


Gauss (1965) is an English translation by Arthur A. Clarke of Gauss’s Disquisitiones 
Arithmeticae; the quotations in the text are taken from this translation. Due to 
space considerations, Gauss was unable to include in this work an eighth section on 
polynomials with coefficients from a finite field. A German translation of this section 
is available in Gauss (1981) pp. 589-629. Günther Frei (2007) gives an excellent
treatment in English of Gauss’s researches on finite fields. Peter Roquette’s 2018 The 
Riemann Hypothesis in Characteristic p in Historical Perspective presents the work 
on this topic from Artin in 1921 to Weil in 1948. 